-
DocuMint: Docstring Generation for Python using Small Language Models
Authors:
Bibek Poudel,
Adam Cook,
Sekou Traore,
Shelah Ameli
Abstract:
Effective communication, specifically through documentation, is the beating heart of collaboration among contributors in software development. Recent advancements in language models (LMs) have enabled the introduction of a new type of actor in that ecosystem: LM-powered assistants capable of code generation, optimization, and maintenance. Our study investigates the efficacy of small language models (SLMs) for generating high-quality docstrings by assessing accuracy, conciseness, and clarity, benchmarking performance quantitatively through mathematical formulas and qualitatively through human evaluation using a Likert scale. Further, we introduce DocuMint, a large-scale supervised fine-tuning dataset with 100,000 samples. In quantitative experiments, Llama 3 8B achieved the best performance across all metrics, with conciseness and clarity scores of 0.605 and 64.88, respectively. However, under human evaluation, CodeGemma 7B achieved the highest overall score with an average of 8.3 out of 10 across all metrics. Fine-tuning the CodeGemma 2B model using the DocuMint dataset led to significant improvements in performance across all metrics, with gains of up to 22.5% in conciseness. The fine-tuned model and the dataset can be found on HuggingFace and the code can be found in the repository.
Submitted 16 May, 2024;
originally announced May 2024.
-
Optimizing the cloud? Don't train models. Build oracles!
Authors:
Tiemo Bang,
Conor Power,
Siavash Ameli,
Natacha Crooks,
Joseph M. Hellerstein
Abstract:
We propose cloud oracles, an alternative to machine learning for online optimization of cloud configurations. Our cloud oracle approach guarantees complete accuracy and explainability of decisions for problems that can be formulated as parametric convex optimizations. We give experimental evidence of this technique's efficacy and share a vision of research directions for expanding its applicability.
Submitted 22 December, 2023; v1 submitted 13 August, 2023;
originally announced August 2023.
-
Analog Feedback-Controlled Memristor programming Circuit for analog Content Addressable Memory
Authors:
Jiaao Yu,
Paul-Philipp Manea,
Sara Ameli,
Mohammad Hizzani,
Amro Eldebiky,
John Paul Strachan
Abstract:
Recent breakthroughs in associative memories suggest that silicon memories are coming closer to human memories, especially memristive Content Addressable Memories (CAMs), which are capable of reading and writing analog values. However, the Program-Verify algorithm, the state-of-the-art memristor programming algorithm, requires frequent switching between verifying and programming memristor conductance, which introduces drawbacks such as high dynamic power and long programming time. Here, we propose an analog feedback-controlled memristor programming circuit that makes use of a novel look-up table-based (LUT-based) programming algorithm. With the proposed algorithm, the programming and the verification of a memristor can be performed in a single-direction sequential process. We also integrated a single proposed programming circuit with eight analog CAM (aCAM) cells to build an aCAM array. We present SPICE simulations on a TSMC 28 nm process. The theoretical analysis shows that (1) a memristor conductance within an aCAM cell can be converted to an output boundary voltage in aCAM searching operations and (2) an output boundary voltage in aCAM searching operations can be converted to a programming data line voltage in aCAM programming operations. The simulation results of the proposed programming circuit confirm the theoretical analysis and thus verify the feasibility of programming memristors without frequently switching between verifying and programming the conductance. In addition, the simulation results of the proposed aCAM array show that the proposed programming circuit can be integrated into a large array architecture.
Submitted 21 April, 2023;
originally announced April 2023.
-
Development of an Efficient and Flexible Pipeline for Lagrangian Coherent Structure Computation
Authors:
Siavash Ameli,
Yogin Desai,
Shawn C. Shadden
Abstract:
The computation of Lagrangian coherent structures (LCS) has become a standard tool for the analysis of advective transport in unsteady flow applications. LCS identification is primarily accomplished by evaluating measures based on the finite-time Cauchy-Green (CG) strain tensor over the fluid domain. Sampling the CG tensor requires the advection of large numbers of fluid tracers, which can be computationally intensive but presents a large degree of data parallelism. Processing can be specialized to parallel computing architectures, but on the other hand, there is a compelling need for robust and flexible implementations for end users. Specifically, code is desirable that can accommodate analysis of wide-ranging fluid mechanics applications while using a modular structure that is easily extended or modified and that facilitates visualization. We discuss the use of Visualization Toolkit (VTK) libraries as a foundation for object-oriented LCS computation, and how this framework can facilitate integration of LCS computation into flow visualization software such as ParaView. We also discuss the development of CUDA GPU kernels for efficient parallel spatial sampling of the flow map, including optimizing these kernels for better utilization.
Submitted 27 September, 2022;
originally announced September 2022.
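As a rough illustration of the CG-tensor measure mentioned in this abstract, the sketch below computes a finite-time Lyapunov exponent (FTLE) field from a sampled flow map. It uses plain NumPy finite differences rather than the VTK or CUDA pipeline the paper describes; the function name `ftle_from_flow_map` and its arguments are hypothetical and chosen only for this example.

```python
import numpy as np

def ftle_from_flow_map(phi_x, phi_y, xs, ys, T):
    """FTLE field from a 2D flow map sampled on a rectilinear grid.

    phi_x[i, j], phi_y[i, j] hold the final position of the tracer seeded
    at (xs[i], ys[j]) after integration time T (hypothetical layout).
    """
    # Flow-map gradient F = d(phi)/d(x0) by finite differences.
    dphix_dx = np.gradient(phi_x, xs, axis=0)
    dphix_dy = np.gradient(phi_x, ys, axis=1)
    dphiy_dx = np.gradient(phi_y, xs, axis=0)
    dphiy_dy = np.gradient(phi_y, ys, axis=1)

    # Assemble F with shape (nx, ny, 2, 2), indexed [..., row, col].
    F = np.stack(
        [np.stack([dphix_dx, dphix_dy], axis=-1),
         np.stack([dphiy_dx, dphiy_dy], axis=-1)],
        axis=-2,
    )

    # Cauchy-Green strain tensor C = F^T F at every grid point.
    C = np.swapaxes(F, -1, -2) @ F

    # eigvalsh returns eigenvalues in ascending order; take the largest.
    lam_max = np.linalg.eigvalsh(C)[..., -1]
    return np.log(np.sqrt(lam_max)) / abs(T)
```

For the linear saddle flow with flow map (x e^T, y e^{-T}), the CG tensor is diag(e^{2T}, e^{-2T}), so this sketch should return an FTLE of 1 everywhere, which makes a convenient sanity check.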
-
A Singular Woodbury and Pseudo-Determinant Matrix Identities and Application to Gaussian Process Regression
Authors:
Siavash Ameli,
Shawn C. Shadden
Abstract:
We study a matrix that arises from a singular form of the Woodbury matrix identity. We present generalized inverse and pseudo-determinant identities for this matrix, which have direct applications for Gaussian process regression, specifically its likelihood representation and precision matrix. We extend the definition of the precision matrix to the Bott-Duffin inverse of the covariance matrix, preserving properties related to conditional independence, conditional precision, and marginal precision. We also provide an efficient algorithm and numerical analysis for the presented determinant identities and demonstrate their advantages under specific conditions relevant to computing log-determinant terms in likelihood functions of Gaussian process regression.
Submitted 24 April, 2023; v1 submitted 16 July, 2022;
originally announced July 2022.
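For orientation, the sketch below checks the standard (nonsingular) Woodbury identity and the matching determinant lemma numerically. It is not the singular form or the pseudo-determinant identities studied in the paper; the helper names `woodbury_inverse` and `logdet_update` are hypothetical.

```python
import numpy as np

def woodbury_inverse(A_inv, U, C_inv, V):
    """(A + U C V)^{-1} via the standard Woodbury identity.

    Assumes A, C, and C^{-1} + V A^{-1} U are all invertible.
    """
    small = C_inv + V @ A_inv @ U  # k-by-k system instead of n-by-n
    return A_inv - A_inv @ U @ np.linalg.solve(small, V @ A_inv)

def logdet_update(A, U, C, V):
    """log|det(A + U C V)| via the matrix determinant lemma."""
    A_inv = np.linalg.inv(A)
    small = np.linalg.inv(C) + V @ A_inv @ U
    # slogdet returns (sign, log|det|); only the log-magnitude is used here.
    return (np.linalg.slogdet(A)[1]
            + np.linalg.slogdet(C)[1]
            + np.linalg.slogdet(small)[1])
```

The appeal of both formulas is that when U has only a few columns, the work shifts from an n-by-n problem to a k-by-k one, which is the regime relevant to log-determinant terms in Gaussian process likelihoods.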
-
Noise Estimation in Gaussian Process Regression
Authors:
Siavash Ameli,
Shawn C. Shadden
Abstract:
We develop a computational procedure to estimate the covariance hyperparameters for semiparametric Gaussian process regression models with additive noise. Namely, the presented method can be used to efficiently estimate the variance of the correlated error and the variance of the noise based on maximizing a marginal likelihood function. Our method involves suitably reducing the dimensionality of the hyperparameter space to simplify the estimation procedure to a univariate root-finding problem. Moreover, we derive bounds and asymptotes of the marginal likelihood function and its derivatives, which are useful for narrowing the initial range of the hyperparameter search. Using numerical examples, we demonstrate the computational advantages and robustness of the presented approach compared to traditional parameter optimization.
Submitted 20 June, 2022;
originally announced June 2022.
-
Hierarchical Robust Adaptive Control for Wind Turbines with Actuator Fault
Authors:
Sina Ameli,
Olugbenga Moses Anubi
Abstract:
This paper solves the problem of regulating the rotor speed tracking error for wind turbines in the full-load region using a robust-adaptive control strategy. The developed controller compensates for the uncertainty in the control input effectiveness caused by a pitch actuator fault, unmeasurable wind disturbance, and nonlinearity in the model. Wind turbines have multi-layer structures such that the high-level structure is nonlinearly coupled through an aggregation of the low-level control authorities. Hence, the control design is divided into two stages. First, an $\mathcal{L}_2$ controller is designed to attenuate the influence of wind disturbance fluctuations on the rotor speed. Then, in the low-level layer, a controller is designed using a proposed adaptation mechanism to compensate for actuator faults. The theoretical results show that the closed-loop equilibrium point of the regulated rotor speed tracking error dynamics in the high level is finite-gain $\mathcal{L}_2$ stable, and the closed-loop error dynamics in the low level is globally asymptotically stable. Simulation results show that the developed controller significantly reduces the root mean square of the rotor speed error compared to some well-known works, despite the strongly fluctuating wind disturbance and the time-varying uncertainty in the control input effectiveness.
Submitted 13 September, 2021;
originally announced September 2021.
-
Low dimensional behaviour of generalized Kuramoto model
Authors:
Sara Ameli,
Keivan Aghababaei Samani
Abstract:
We study the global bifurcations of the frequency-weighted Kuramoto model in low dimensions for a network of fully connected oscillators. To study the effect of a non-zero-centered frequency distribution, we consider two symmetric Lorentzians as an example. We derive the stability diagram of the system and show that the infinite-dimensional problem reduces to a flow in four dimensions. Using the system symmetries, it can be further reduced to two dimensions. Using this analytic framework, we obtain bifurcation boundaries of the system, which are consistent with our numerical simulations. We show that the system has three types of transitions to the synchronized state for different parameters of the frequency distribution: (1) a two-step transition, representative of standing waves, (2) a continuous transition, as in the classical Kuramoto model, and (3) a first-order transition with hysteresis. Numerical simulations are also conducted to confirm the analytic results.
Submitted 4 September, 2021;
originally announced September 2021.
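To make the setting of this abstract concrete, the sketch below integrates the classical all-to-all Kuramoto model in its mean-field form and measures the order parameter r. It does not implement the frequency-weighted coupling or the low-dimensional reduction from the paper; the function names, the forward-Euler integrator, and the default step sizes are assumptions made for this example.

```python
import numpy as np

def order_parameter(theta):
    """Magnitude r of the complex order parameter (1/N) sum_j exp(i theta_j)."""
    return np.abs(np.mean(np.exp(1j * theta)))

def sample_two_lorentzians(n, center, width, rng):
    """Frequencies drawn from two symmetric Lorentzians at +center and -center."""
    signs = rng.choice([-1.0, 1.0], size=n)
    return signs * center + width * rng.standard_cauchy(n)

def simulate_kuramoto(omega, coupling, theta0, dt=0.01, steps=5000):
    """Forward-Euler integration of the classical Kuramoto model.

    Uses the mean-field form d(theta_i)/dt = omega_i + K r sin(psi - theta_i),
    which is equivalent to all-to-all coupling (K/N) sum_j sin(theta_j - theta_i).
    """
    theta = np.array(theta0, dtype=float)
    for _ in range(steps):
        z = np.mean(np.exp(1j * theta))
        r, psi = np.abs(z), np.angle(z)
        theta += dt * (omega + coupling * r * np.sin(psi - theta))
    return theta
```

Sweeping `coupling` for frequencies drawn with `sample_two_lorentzians` and recording `order_parameter` at the end of each run is one simple way to explore how the transition to synchrony depends on the frequency distribution.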
-
Robust Control for a Class of Nonlinearly Coupled Hierarchical Systems with Actuator Faults
Authors:
Sina Ameli,
Olugbenga Moses Anubi
Abstract:
This paper proposes an approach to address the control challenges posed by a fault-induced uncertainty in both the dynamics and control input effectiveness of a class of hierarchical nonlinear systems in which the high-level dynamics is nonlinearly coupled with a multi-agent low-level dynamics. The high-level dynamics has a multiplicative uncertainty in the control input effectiveness and is subjected to an exogenous disturbance input. On the other hand, the low-level system is subjected to actuator faults causing a time-varying multiplicative uncertainty in the dynamical model and associated control effectiveness. Moreover, the nonlinear coupling between the high-level and the low-level dynamics makes the problem even more challenging. To address this problem, an online parameter estimation algorithm is designed, coupled with an adaptive splitting mechanism which automatically distributes the control action among low-level multi-agent systems. A nonlinear $\mathcal{L}_2$-gain-based controller and a state-feedback controller are designed at the high level and the low level, respectively, to recover the system from faults with high performance in the transient response and to reject the exogenous disturbance. The resulting analysis guarantees robust tracking of the high-level reference command signal.
Submitted 6 August, 2021; v1 submitted 28 June, 2021;
originally announced June 2021.
-
Mott Memristors based on Field-Induced Carrier Avalanche Multiplication
Authors:
Francesco Peronaci,
Sara Ameli,
Shintaro Takayoshi,
Alexandra Landsman,
Takashi Oka
Abstract:
We present a theory of Mott memristors whose working principle is the non-linear carrier avalanche multiplication in Mott insulators subject to strong electric fields. The internal state of the memristor, which determines its resistance, is encoded in the density of doublon and hole excitations in the Mott insulator. In the current-voltage characteristic, insulating and conducting states are separated by a negative-differential-resistance region, leading to hysteretic behavior. Under oscillating voltage, the response of a voltage-controlled, non-polar memristive system is obtained, with retarded current and pinched hysteresis loop. As a first step towards neuromorphic applications, we demonstrate self-sustained spiking oscillations in a circuit with a parallel capacitor. Being based on electronic excitations only, this memristor is up to several orders of magnitude faster than previous proposals relying on Joule heating or ionic drift.
Submitted 1 April, 2021;
originally announced April 2021.
-
Interpolating Log-Determinant and Trace of the Powers of Matrix $\mathbf{A} + t \mathbf{B}$
Authors:
Siavash Ameli,
Shawn C. Shadden
Abstract:
We develop heuristic interpolation methods for the functions $t \mapsto \log \det \left( \mathbf{A} + t \mathbf{B} \right)$ and $t \mapsto \operatorname{trace}\left( (\mathbf{A} + t \mathbf{B})^{p} \right)$ where the matrices $\mathbf{A}$ and $\mathbf{B}$ are Hermitian and positive (semi) definite and $p$ and $t$ are real variables. These functions are featured in many applications in statistics, machine learning, and computational physics. The presented interpolation functions are based on the modification of sharp bounds for these functions. We demonstrate the accuracy and performance of the proposed method with numerical examples, namely, the marginal maximum likelihood estimation for Gaussian process regression and the estimation of the regularization parameter of ridge regression with the generalized cross-validation method.
Submitted 3 August, 2022; v1 submitted 15 September, 2020;
originally announced September 2020.
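As a point of reference for the function $t \mapsto \log \det(\mathbf{A} + t \mathbf{B})$ in this abstract, the sketch below evaluates it exactly for many values of $t$ after a single eigenvalue decomposition. This is a standard baseline, not the heuristic interpolation proposed in the paper. It assumes real symmetric A and real symmetric positive definite B; the name `make_logdet_function` is hypothetical.

```python
import numpy as np

def make_logdet_function(A, B):
    """Return f(t) = log det(A + t B) for real symmetric A and SPD B.

    With B = L L^T (Cholesky) and M = L^{-1} A L^{-T}, we have
    det(A + t B) = det(B) * det(M + t I), so after one eigendecomposition
    of M each evaluation of f costs only O(n).
    """
    L = np.linalg.cholesky(B)
    Y = np.linalg.solve(L, A)        # L^{-1} A
    M = np.linalg.solve(L, Y.T)      # L^{-1} A L^{-T}, since A is symmetric
    M = 0.5 * (M + M.T)              # remove round-off asymmetry
    mu = np.linalg.eigvalsh(M)
    logdet_B = 2.0 * np.sum(np.log(np.diag(L)))

    def f(t):
        # Valid whenever A + t B is positive definite, i.e. mu + t > 0.
        return logdet_B + np.sum(np.log(mu + t))

    return f
```

The upfront cost is one Cholesky factorization and one symmetric eigensolve; interpolation schemes such as the one in the paper aim to avoid even that cost when the matrices are large.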
-
A transport method for restoring incomplete ocean current measurements
Authors:
Siavash Ameli,
Shawn C. Shadden
Abstract:
Remote sensing of oceanographic data often yields incomplete coverage of the measurement domain. This can limit interpretability of the data and identification of coherent features informative of ocean dynamics. Several methods exist to fill gaps of missing oceanographic data, and are often based on projecting the measurements onto basis functions or a statistical model. Herein, we use an information transport approach inspired by an image processing algorithm. This approach aims to restore gaps in data by advecting and diffusing information of features as opposed to the field itself. Since this method does not involve fitting or projection, the portions of the domain containing measurements can remain unaltered, and the method offers control over the extent of local information transfer. This method is applied to measurements of ocean surface currents by high-frequency radars. This is a relevant application because data coverage can be sporadic and filling data gaps can be essential to data usability. Application to two regions with differing spatial scales is considered. The accuracy and robustness of the method are tested by systematically blinding measurements and comparing the restored data at these locations to the actual measurements. These results demonstrate that even for locally large percentages of missing data points, the restored velocities have errors within the native error of the original data (e.g., $<10$% for velocity magnitude and $<3$% for velocity direction). Results were relatively insensitive to model parameters, facilitating a priori selection of default parameters for de novo applications.
Submitted 23 August, 2018;
originally announced August 2018.
-
The effects of noise and time delay on the synchronization of the Kuramoto model in small-world networks
Authors:
Sara Ameli,
Farhad Shahbazi,
Maryam Karimian,
Tahereh Malakoutikhah
Abstract:
We study the synchronization of a small-world network of identical coupled phase oscillators with Kuramoto interaction. First, we consider the model with instantaneous mutual interaction and the coupling constant normalized to the degree of each node. For this model, similar to the constant coupling studied before, we find the existence of various attractors corresponding to the different defect patterns and also noise-enhanced synchronization when driven by an external uncorrelated white noise. We also investigate the synchronization of the model with homogeneous time delay in the phase couplings. For a given intrinsic frequency and coupling constant, upon varying the time delay we observe the existence of a partially synchronized state with defect patterns, which transforms to an incoherent phase characterized by randomly phase-locked states. Upon further increasing the time delay, this phase again undergoes a transition to another patterned partially synchronized state. We show that the transitions between these phases are discontinuous and, moreover, that in each phase the average frequency of the oscillators decreases with increasing time delay and shows an upward jump at the transition points.
Submitted 22 May, 2017;
originally announced May 2017.