Showing 1–11 of 11 results for author: Dereventsov, A

  1. arXiv:2402.14837  [pdf, other]

    cs.CL cs.AI cs.HC cs.LG

    An Empirical Categorization of Prompting Techniques for Large Language Models: A Practitioner's Guide

    Authors: Oluwole Fagbohun, Rachel M. Harrison, Anton Dereventsov

    Abstract: Due to rapid advancements in the development of Large Language Models (LLMs), programming these models with prompts has recently gained significant attention. However, the sheer number of available prompt engineering techniques creates an overwhelming landscape for practitioners looking to utilize these tools. For the most efficient and effective use of LLMs, it is important to compile a comprehen…

    Submitted 18 February, 2024; originally announced February 2024.

  2. arXiv:2310.05324  [pdf, other]

    cs.LG

    Increasing Entropy to Boost Policy Gradient Performance on Personalization Tasks

    Authors: Andrew Starnes, Anton Dereventsov, Clayton Webster

    Abstract: In this effort, we consider the impact of regularization on the diversity of actions taken by policies generated from reinforcement learning agents trained using a policy gradient. Policy gradient agents are prone to entropy collapse, which means certain actions are seldom, if ever, selected. We augment the optimization objective function for the policy with terms constructed from various…

    Submitted 8 October, 2023; originally announced October 2023.

    Comments: 8 pages, 3 figures, accepted to WAIN 2023

  3. arXiv:2309.01026  [pdf, other]

    cs.AI cs.CL cs.IR cs.LG cs.MM

    Zero-Shot Recommendations with Pre-Trained Large Language Models for Multimodal Nudging

    Authors: Rachel M. Harrison, Anton Dereventsov, Anton Bibin

    Abstract: We present a method for zero-shot recommendation of multimodal non-stationary content that leverages recent advancements in the field of generative AI. We propose rendering inputs of different modalities as textual descriptions and utilizing pre-trained LLMs to obtain their numerical representations by computing semantic embeddings. Once unified representations of all content items are obtained,…

    Submitted 30 September, 2023; v1 submitted 2 September, 2023; originally announced September 2023.

  4. arXiv:2301.09454  [pdf, other]

    stat.ML cs.LG stat.AP

    Modeling Non-deterministic Human Behaviors in Discrete Food Choices

    Authors: Andrew Starnes, Anton Dereventsov, E. Susanne Blazek, Folasade Phillips

    Abstract: We establish a non-deterministic model that predicts a user's food preferences from their demographic information. Our simulator is based on the NHANES dataset and domain expert knowledge in the form of established behavioral studies. Our model can be used to generate an arbitrary number of synthetic datapoints that are similar in distribution to the original dataset and align with behavioral science…

    Submitted 23 January, 2023; originally announced January 2023.

    Comments: 6 pages, 4 figures, published in 2022 IEEE International Conference on Data Mining Workshops (ICDMW)

  5. arXiv:2211.11869  [pdf, other]

    cs.LG cs.AI math.NA math.OC

    Examining Policy Entropy of Reinforcement Learning Agents for Personalization Tasks

    Authors: Anton Dereventsov, Andrew Starnes, Clayton G. Webster

    Abstract: This effort is focused on examining the behavior of reinforcement learning systems in personalization environments and detailing the differences in policy entropy associated with the type of learning algorithm utilized. We demonstrate that Policy Optimization agents often possess low-entropy policies during training, which in practice results in agents prioritizing certain actions and avoiding oth…

    Submitted 27 April, 2024; v1 submitted 21 November, 2022; originally announced November 2022.

  6. arXiv:2210.10631  [pdf, other]

    cs.IR cs.AI cs.LG

    Simulated Contextual Bandits for Personalization Tasks from Recommendation Datasets

    Authors: Anton Dereventsov, Anton Bibin

    Abstract: We propose a method for generating simulated contextual bandit environments for personalization tasks from recommendation datasets such as MovieLens, Netflix, Last.fm, and Million Song. This allows personalization environments to be developed based on real-life data to reflect the nuanced nature of real-world user interactions. The obtained environments can be used to develop methods for solving…

    Submitted 17 December, 2022; v1 submitted 12 October, 2022; originally announced October 2022.

  7. arXiv:2112.13141  [pdf, other]

    cs.LG cs.AI math.NA

    On the Unreasonable Efficiency of State Space Clustering in Personalization Tasks

    Authors: Anton Dereventsov, Ranga Raju Vatsavai, Clayton Webster

    Abstract: In this effort, we consider a reinforcement learning (RL) technique for solving personalization tasks with complex reward signals. In particular, our approach is based on state space clustering with the use of a simplistic $k$-means algorithm as well as conventional choices of the network architectures and optimization algorithms. Numerical examples demonstrate the efficiency of different RL proced…

    Submitted 24 December, 2021; originally announced December 2021.

  8. arXiv:2106.03934  [pdf, other]

    cs.LG cs.AI

    Offline Policy Comparison under Limited Historical Agent-Environment Interactions

    Authors: Anton Dereventsov, Joseph D. Daws Jr., Clayton Webster

    Abstract: We address the challenge of policy evaluation in real-world applications of reinforcement learning systems where the available historical data is limited due to ethical, practical, or security considerations. This constrained distribution of data samples often leads to biased policy evaluation estimates. To remedy this, we propose that instead of policy evaluation, one should perform policy compar…

    Submitted 7 June, 2021; originally announced June 2021.

  9. arXiv:2006.10887  [pdf, other]

    math.OC cs.LG

    An adaptive stochastic gradient-free approach for high-dimensional blackbox optimization

    Authors: Anton Dereventsov, Clayton G. Webster, Joseph D. Daws Jr

    Abstract: In this work, we propose a novel adaptive stochastic gradient-free (ASGF) approach for solving high-dimensional nonconvex optimization problems based on function evaluations. We employ a directional Gaussian smoothing of the target function that generates a surrogate of the gradient and assists in avoiding bad local optima by utilizing nonlocal information of the loss landscape. Applying a determi…

    Submitted 15 January, 2022; v1 submitted 18 June, 2020; originally announced June 2020.

  10. arXiv:1910.02743  [pdf, ps, other]

    cs.LG stat.ML

    Neural network integral representations with the ReLU activation function

    Authors: Armenak Petrosyan, Anton Dereventsov, Clayton Webster

    Abstract: In this effort, we derive a formula for the integral representation of a shallow neural network with the ReLU activation function. We assume that the outer weights admit a finite $L_1$-norm with respect to Lebesgue measure on the sphere. For univariate target functions we further provide a closed-form formula for all possible representations. Additionally, in this case our formula allows one to exp…

    Submitted 10 June, 2020; v1 submitted 7 October, 2019; originally announced October 2019.

  11. arXiv:1905.10409  [pdf, other]

    cs.LG stat.ML

    Greedy Shallow Networks: An Approach for Constructing and Training Neural Networks

    Authors: Anton Dereventsov, Armenak Petrosyan, Clayton Webster

    Abstract: We present a greedy-based approach to construct an efficient single hidden layer neural network with the ReLU activation that approximates a target function. In our approach we obtain a shallow network by utilizing a greedy algorithm with the prescribed dictionary provided by the available training data and a set of possible inner weights. To facilitate the greedy selection process we employ an in…

    Submitted 30 September, 2021; v1 submitted 24 May, 2019; originally announced May 2019.