-
An Empirical Categorization of Prompting Techniques for Large Language Models: A Practitioner's Guide
Authors:
Oluwole Fagbohun,
Rachel M. Harrison,
Anton Dereventsov
Abstract:
Due to rapid advancements in the development of Large Language Models (LLMs), programming these models with prompts has recently gained significant attention. However, the sheer number of available prompt engineering techniques creates an overwhelming landscape for practitioners looking to utilize these tools. For the most efficient and effective use of LLMs, it is important to compile a comprehen…
▽ More
Due to rapid advancements in the development of Large Language Models (LLMs), programming these models with prompts has recently gained significant attention. However, the sheer number of available prompt engineering techniques creates an overwhelming landscape for practitioners looking to utilize these tools. For the most efficient and effective use of LLMs, it is important to compile a comprehensive list of prompting techniques and establish a standardized, interdisciplinary categorization framework. In this survey, we examine some of the most well-known prompting techniques from both academic and practical viewpoints and classify them into seven distinct categories. We present an overview of each category, aiming to clarify their unique contributions and showcase their practical applications in real-world examples in order to equip fellow practitioners with a structured framework for understanding and categorizing prompting techniques tailored to their specific domains. We believe that this approach will help simplify the complex landscape of prompt engineering and enable more effective utilization of LLMs in various applications. By providing practitioners with a systematic approach to prompt categorization, we aim to assist in navigating the intricacies of effective prompt design for conversational pre-trained LLMs and inspire new possibilities in their respective fields.
△ Less
Submitted 18 February, 2024;
originally announced February 2024.
-
Increasing Entropy to Boost Policy Gradient Performance on Personalization Tasks
Authors:
Andrew Starnes,
Anton Dereventsov,
Clayton Webster
Abstract:
In this effort, we consider the impact of regularization on the diversity of actions taken by policies generated from reinforcement learning agents trained using a policy gradient. Policy gradient agents are prone to entropy collapse, which means certain actions are seldomly, if ever, selected. We augment the optimization objective function for the policy with terms constructed from various…
▽ More
In this effort, we consider the impact of regularization on the diversity of actions taken by policies generated from reinforcement learning agents trained using a policy gradient. Policy gradient agents are prone to entropy collapse, which means certain actions are seldomly, if ever, selected. We augment the optimization objective function for the policy with terms constructed from various $\varphi$-divergences and Maximum Mean Discrepancy which encourages current policies to follow different state visitation and/or action choice distribution than previously computed policies. We provide numerical experiments using MNIST, CIFAR10, and Spotify datasets. The results demonstrate the advantage of diversity-promoting policy regularization and that its use on gradient-based approaches have significantly improved performance on a variety of personalization tasks. Furthermore, numerical evidence is given to show that policy regularization increases performance without losing accuracy.
△ Less
Submitted 8 October, 2023;
originally announced October 2023.
-
Zero-Shot Recommendations with Pre-Trained Large Language Models for Multimodal Nudging
Authors:
Rachel M. Harrison,
Anton Dereventsov,
Anton Bibin
Abstract:
We present a method for zero-shot recommendation of multimodal non-stationary content that leverages recent advancements in the field of generative AI. We propose rendering inputs of different modalities as textual descriptions and to utilize pre-trained LLMs to obtain their numerical representations by computing semantic embeddings. Once unified representations of all content items are obtained,…
▽ More
We present a method for zero-shot recommendation of multimodal non-stationary content that leverages recent advancements in the field of generative AI. We propose rendering inputs of different modalities as textual descriptions and to utilize pre-trained LLMs to obtain their numerical representations by computing semantic embeddings. Once unified representations of all content items are obtained, the recommendation can be performed by computing an appropriate similarity metric between them without any additional learning. We demonstrate our approach on a synthetic multimodal nudging environment, where the inputs consist of tabular, textual, and visual data.
△ Less
Submitted 30 September, 2023; v1 submitted 2 September, 2023;
originally announced September 2023.
-
Modeling Non-deterministic Human Behaviors in Discrete Food Choices
Authors:
Andrew Starnes,
Anton Dereventsov,
E. Susanne Blazek,
Folasade Phillips
Abstract:
We establish a non-deterministic model that predicts a user's food preferences from their demographic information. Our simulator is based on NHANES dataset and domain expert knowledge in the form of established behavioral studies. Our model can be used to generate an arbitrary amount of synthetic datapoints that are similar in distribution to the original dataset and align with behavioral science…
▽ More
We establish a non-deterministic model that predicts a user's food preferences from their demographic information. Our simulator is based on NHANES dataset and domain expert knowledge in the form of established behavioral studies. Our model can be used to generate an arbitrary amount of synthetic datapoints that are similar in distribution to the original dataset and align with behavioral science expectations. Such a simulator can be used in a variety of machine learning tasks and especially in applications requiring human behavior prediction.
△ Less
Submitted 23 January, 2023;
originally announced January 2023.
-
Examining Policy Entropy of Reinforcement Learning Agents for Personalization Tasks
Authors:
Anton Dereventsov,
Andrew Starnes,
Clayton G. Webster
Abstract:
This effort is focused on examining the behavior of reinforcement learning systems in personalization environments and detailing the differences in policy entropy associated with the type of learning algorithm utilized. We demonstrate that Policy Optimization agents often possess low-entropy policies during training, which in practice results in agents prioritizing certain actions and avoiding oth…
▽ More
This effort is focused on examining the behavior of reinforcement learning systems in personalization environments and detailing the differences in policy entropy associated with the type of learning algorithm utilized. We demonstrate that Policy Optimization agents often possess low-entropy policies during training, which in practice results in agents prioritizing certain actions and avoiding others. Conversely, we also show that Q-Learning agents are far less susceptible to such behavior and generally maintain high-entropy policies throughout training, which is often preferable in real-world applications. We provide a wide range of numerical experiments as well as theoretical justification to show that these differences in entropy are due to the type of learning being employed.
△ Less
Submitted 27 April, 2024; v1 submitted 21 November, 2022;
originally announced November 2022.
-
Simulated Contextual Bandits for Personalization Tasks from Recommendation Datasets
Authors:
Anton Dereventsov,
Anton Bibin
Abstract:
We propose a method for generating simulated contextual bandit environments for personalization tasks from recommendation datasets like MovieLens, Netflix, Last.fm, Million Song, etc. This allows for personalization environments to be developed based on real-life data to reflect the nuanced nature of real-world user interactions. The obtained environments can be used to develop methods for solving…
▽ More
We propose a method for generating simulated contextual bandit environments for personalization tasks from recommendation datasets like MovieLens, Netflix, Last.fm, Million Song, etc. This allows for personalization environments to be developed based on real-life data to reflect the nuanced nature of real-world user interactions. The obtained environments can be used to develop methods for solving personalization tasks, algorithm benchmarking, model simulation, and more. We demonstrate our approach with numerical examples on MovieLens and IMDb datasets.
△ Less
Submitted 17 December, 2022; v1 submitted 12 October, 2022;
originally announced October 2022.
-
On the Unreasonable Efficiency of State Space Clustering in Personalization Tasks
Authors:
Anton Dereventsov,
Ranga Raju Vatsavai,
Clayton Webster
Abstract:
In this effort we consider a reinforcement learning (RL) technique for solving personalization tasks with complex reward signals. In particular, our approach is based on state space clustering with the use of a simplistic $k$-means algorithm as well as conventional choices of the network architectures and optimization algorithms. Numerical examples demonstrate the efficiency of different RL proced…
▽ More
In this effort we consider a reinforcement learning (RL) technique for solving personalization tasks with complex reward signals. In particular, our approach is based on state space clustering with the use of a simplistic $k$-means algorithm as well as conventional choices of the network architectures and optimization algorithms. Numerical examples demonstrate the efficiency of different RL procedures and are used to illustrate that this technique accelerates the agent's ability to learn and does not restrict the agent's performance.
△ Less
Submitted 24 December, 2021;
originally announced December 2021.
-
Offline Policy Comparison under Limited Historical Agent-Environment Interactions
Authors:
Anton Dereventsov,
Joseph D. Daws Jr.,
Clayton Webster
Abstract:
We address the challenge of policy evaluation in real-world applications of reinforcement learning systems where the available historical data is limited due to ethical, practical, or security considerations. This constrained distribution of data samples often leads to biased policy evaluation estimates. To remedy this, we propose that instead of policy evaluation, one should perform policy compar…
▽ More
We address the challenge of policy evaluation in real-world applications of reinforcement learning systems where the available historical data is limited due to ethical, practical, or security considerations. This constrained distribution of data samples often leads to biased policy evaluation estimates. To remedy this, we propose that instead of policy evaluation, one should perform policy comparison, i.e. to rank the policies of interest in terms of their value based on available historical data. In addition we present the Limited Data Estimator (LDE) as a simple method for evaluating and comparing policies from a small number of interactions with the environment. According to our theoretical analysis, the LDE is shown to be statistically reliable on policy comparison tasks under mild assumptions on the distribution of the historical data. Additionally, our numerical experiments compare the LDE to other policy evaluation methods on the task of policy ranking and demonstrate its advantage in various settings.
△ Less
Submitted 7 June, 2021;
originally announced June 2021.
-
An adaptive stochastic gradient-free approach for high-dimensional blackbox optimization
Authors:
Anton Dereventsov,
Clayton G. Webster,
Joseph D. Daws Jr
Abstract:
In this work, we propose a novel adaptive stochastic gradient-free (ASGF) approach for solving high-dimensional nonconvex optimization problems based on function evaluations. We employ a directional Gaussian smoothing of the target function that generates a surrogate of the gradient and assists in avoiding bad local optima by utilizing nonlocal information of the loss landscape. Applying a determi…
▽ More
In this work, we propose a novel adaptive stochastic gradient-free (ASGF) approach for solving high-dimensional nonconvex optimization problems based on function evaluations. We employ a directional Gaussian smoothing of the target function that generates a surrogate of the gradient and assists in avoiding bad local optima by utilizing nonlocal information of the loss landscape. Applying a deterministic quadrature scheme results in a massively scalable technique that is sample-efficient and achieves spectral accuracy. At each step we randomly generate the search directions while primarily following the surrogate of the smoothed gradient. This enables exploitation of the gradient direction while maintaining sufficient space exploration, and accelerates convergence towards the global extrema. In addition, we make use of a local approximation of the Lipschitz constant in order to adaptively adjust the values of all hyperparameters, thus removing the careful fine-tuning of current algorithms that is often necessary to be successful when applied to a large class of learning tasks. As such, the ASGF strategy offers significant improvements when solving high-dimensional nonconvex optimization problems when compared to other gradient-free methods (including the so called "evolutionary strategies") as well as iterative approaches that rely on the gradient information of the objective function. We illustrate the improved performance of this method by providing several comparative numerical studies on benchmark global optimization problems and reinforcement learning tasks.
△ Less
Submitted 15 January, 2022; v1 submitted 18 June, 2020;
originally announced June 2020.
-
Neural network integral representations with the ReLU activation function
Authors:
Armenak Petrosyan,
Anton Dereventsov,
Clayton Webster
Abstract:
In this effort, we derive a formula for the integral representation of a shallow neural network with the ReLU activation function. We assume that the outer weighs admit a finite $L_1$-norm with respect to Lebesgue measure on the sphere. For univariate target functions we further provide a closed-form formula for all possible representations. Additionally, in this case our formula allows one to exp…
▽ More
In this effort, we derive a formula for the integral representation of a shallow neural network with the ReLU activation function. We assume that the outer weighs admit a finite $L_1$-norm with respect to Lebesgue measure on the sphere. For univariate target functions we further provide a closed-form formula for all possible representations. Additionally, in this case our formula allows one to explicitly solve the least $L_1$-norm neural network representation for a given function.
△ Less
Submitted 10 June, 2020; v1 submitted 7 October, 2019;
originally announced October 2019.
-
Greedy Shallow Networks: An Approach for Constructing and Training Neural Networks
Authors:
Anton Dereventsov,
Armenak Petrosyan,
Clayton Webster
Abstract:
We present a greedy-based approach to construct an efficient single hidden layer neural network with the ReLU activation that approximates a target function. In our approach we obtain a shallow network by utilizing a greedy algorithm with the prescribed dictionary provided by the available training data and a set of possible inner weights. To facilitate the greedy selection process we employ an in…
▽ More
We present a greedy-based approach to construct an efficient single hidden layer neural network with the ReLU activation that approximates a target function. In our approach we obtain a shallow network by utilizing a greedy algorithm with the prescribed dictionary provided by the available training data and a set of possible inner weights. To facilitate the greedy selection process we employ an integral representation of the network, based on the ridgelet transform, that significantly reduces the cardinality of the dictionary and hence promotes feasibility of the greedy selection. Our approach allows for the construction of efficient architectures which can be treated either as improved initializations to be used in place of random-based alternatives, or as fully-trained networks in certain cases, thus potentially nullifying the need for backpropagation training. Numerical experiments demonstrate the tenability of the proposed concept and its advantages compared to the conventional techniques for selecting architectures and initializations for neural networks.
△ Less
Submitted 30 September, 2021; v1 submitted 24 May, 2019;
originally announced May 2019.