-
Diversifying by Intent in Recommender Systems
Authors:
Yuyan Wang,
Cheenar Banerjee,
Samer Chucri,
Fabio Soldo,
Sriraj Badam,
Ed H. Chi,
Minmin Chen
Abstract:
It has become increasingly clear that recommender systems overly focusing on short-term engagement can inadvertently hurt long-term user experience. However, it is challenging to optimize long-term user experience directly as the desired signal is sparse, noisy and manifests over a long horizon. In this work, we show the benefits of incorporating higher-level user understanding, specifically user…
▽ More
It has become increasingly clear that recommender systems overly focusing on short-term engagement can inadvertently hurt long-term user experience. However, it is challenging to optimize long-term user experience directly as the desired signal is sparse, noisy and manifests over a long horizon. In this work, we show the benefits of incorporating higher-level user understanding, specifically user intents that can persist across multiple interactions or recommendation sessions, for whole-page recommendation toward optimizing long-term user experience. User intent has primarily been investigated within the context of search, but remains largely under-explored for recommender systems. To bridge this gap, we develop a probabilistic intent-based whole-page diversification framework in the final stage of a recommender system. Starting with a prior belief of user intents, the proposed diversification framework sequentially selects items at each position based on these beliefs, and subsequently updates posterior beliefs about the intents. It ensures that different user intents are represented in a page towards optimizing long-term user experience.
We experiment with the intent diversification framework on one of the world's largest content recommendation platforms, serving billions of users daily. Our framework incorporates the user's exploration intent, capturing their propensity to explore new interests and content. Live experiments show that the proposed framework leads to an increase in user retention and overall user enjoyment, validating its effectiveness in facilitating long-term planning. In particular, it enables users to consistently discover and engage with diverse contents that align with their underlying intents over time, thereby leading to an improved long-term user experience.
△ Less
Submitted 20 May, 2024;
originally announced May 2024.
-
Static and Dynamic Synthesis of Bengali and Devanagari Signatures
Authors:
Miguel A. Ferrer,
Sukalpa Chanda,
Moises Diaz,
Chayan Kr. Banerjee,
Anirban Majumdar,
Cristina Carmona-Duarte,
Parikshit Acharya,
Umapada Pal
Abstract:
Develo** an automatic signature verification system is challenging and demands a large number of training samples. This is why synthetic handwriting generation is an emerging topic in document image analysis. Some handwriting synthesizers use the motor equivalence model, the well-established hypothesis from neuroscience, which analyses how a human being accomplishes movement. Specifically, a mot…
▽ More
Develo** an automatic signature verification system is challenging and demands a large number of training samples. This is why synthetic handwriting generation is an emerging topic in document image analysis. Some handwriting synthesizers use the motor equivalence model, the well-established hypothesis from neuroscience, which analyses how a human being accomplishes movement. Specifically, a motor equivalence model divides human actions into two steps: 1) the effector independent step at cognitive level and 2) the effector dependent step at motor level. In fact, recent work reports the successful application to Western scripts of a handwriting synthesizer, based on this theory. This paper aims to adapt this scheme for the generation of synthetic signatures in two Indic scripts, Bengali (Bangla), and Devanagari (Hindi). For this purpose, we use two different online and offline databases for both Bengali and Devanagari signatures. This paper reports an effective synthesizer for static and dynamic signatures written in Devanagari or Bengali scripts. We obtain promising results with artificially generated signatures in terms of appearance and performance when we compare the results with those for real signatures.
△ Less
Submitted 30 January, 2024;
originally announced January 2024.
-
A Survey on Physics Informed Reinforcement Learning: Review and Open Problems
Authors:
Chayan Banerjee,
Kien Nguyen,
Clinton Fookes,
Maziar Raissi
Abstract:
The inclusion of physical information in machine learning frameworks has revolutionized many application areas. This involves enhancing the learning process by incorporating physical constraints and adhering to physical laws. In this work we explore their utility for reinforcement learning applications. We present a thorough review of the literature on incorporating physics information, as known a…
▽ More
The inclusion of physical information in machine learning frameworks has revolutionized many application areas. This involves enhancing the learning process by incorporating physical constraints and adhering to physical laws. In this work we explore their utility for reinforcement learning applications. We present a thorough review of the literature on incorporating physics information, as known as physics priors, in reinforcement learning approaches, commonly referred to as physics-informed reinforcement learning (PIRL). We introduce a novel taxonomy with the reinforcement learning pipeline as the backbone to classify existing works, compare and contrast them, and derive crucial insights. Existing works are analyzed with regard to the representation/ form of the governing physics modeled for integration, their specific contribution to the typical reinforcement learning architecture, and their connection to the underlying reinforcement learning pipeline stages. We also identify core learning architectures and physics incorporation biases (i.e., observational, inductive and learning) of existing PIRL approaches and use them to further categorize the works for better understanding and adaptation. By providing a comprehensive perspective on the implementation of the physics-informed capability, the taxonomy presents a cohesive approach to PIRL. It identifies the areas where this approach has been applied, as well as the gaps and opportunities that exist. Additionally, the taxonomy sheds light on unresolved issues and challenges, which can guide future research. This nascent field holds great potential for enhancing reinforcement learning algorithms by increasing their physical plausibility, precision, data efficiency, and applicability in real-world scenarios.
△ Less
Submitted 4 September, 2023;
originally announced September 2023.
-
Physics-Informed Computer Vision: A Review and Perspectives
Authors:
Chayan Banerjee,
Kien Nguyen,
Clinton Fookes,
George Karniadakis
Abstract:
The incorporation of physical information in machine learning frameworks is opening and transforming many application domains. Here the learning process is augmented through the induction of fundamental knowledge and governing physical laws. In this work, we explore their utility for computer vision tasks in interpreting and understanding visual data. We present a systematic literature review of m…
▽ More
The incorporation of physical information in machine learning frameworks is opening and transforming many application domains. Here the learning process is augmented through the induction of fundamental knowledge and governing physical laws. In this work, we explore their utility for computer vision tasks in interpreting and understanding visual data. We present a systematic literature review of more than 250 papers on formulation and approaches to computer vision tasks guided by physical laws. We begin by decomposing the popular computer vision pipeline into a taxonomy of stages and investigate approaches to incorporate governing physical equations in each stage. Existing approaches in computer vision tasks are analyzed with regard to what governing physical processes are modeled and formulated, and how they are incorporated, i.e. modification of input data (observation bias), modification of network architectures (inductive bias), and modification of training losses (learning bias). The taxonomy offers a unified view of the application of the physics-informed capability, highlighting where physics-informed learning has been conducted and where the gaps and opportunities are. Finally, we highlight open problems and challenges to inform future research. While still in its early days, the study of physics-informed computer vision has the promise to develop better computer vision models that can improve physical plausibility, accuracy, data efficiency, and generalization in increasingly realistic applications.
△ Less
Submitted 12 May, 2024; v1 submitted 29 May, 2023;
originally announced May 2023.
-
Boosting Exploration in Actor-Critic Algorithms by Incentivizing Plausible Novel States
Authors:
Chayan Banerjee,
Zhiyong Chen,
Nasimul Noman
Abstract:
Actor-critic (AC) algorithms are a class of model-free deep reinforcement learning algorithms, which have proven their efficacy in diverse domains, especially in solving continuous control problems. Improvement of exploration (action entropy) and exploitation (expected return) using more efficient samples is a critical issue in AC algorithms. A basic strategy of a learning algorithm is to facilita…
▽ More
Actor-critic (AC) algorithms are a class of model-free deep reinforcement learning algorithms, which have proven their efficacy in diverse domains, especially in solving continuous control problems. Improvement of exploration (action entropy) and exploitation (expected return) using more efficient samples is a critical issue in AC algorithms. A basic strategy of a learning algorithm is to facilitate indiscriminately exploring all of the environment state space, as well as to encourage exploring rarely visited states rather than frequently visited one. Under this strategy, we propose a new method to boost exploration through an intrinsic reward, based on measurement of a state's novelty and the associated benefit of exploring the state (with regards to policy optimization), altogether called plausible novelty. With incentivized exploration of plausible novel states, an AC algorithm is able to improve its sample efficiency and hence training performance. The new method is verified by extensive simulations of continuous control tasks of MuJoCo environments on a variety of prominent off-policy AC algorithms.
△ Less
Submitted 1 October, 2022;
originally announced October 2022.
-
Improved Soft Actor-Critic: Mixing Prioritized Off-Policy Samples with On-Policy Experience
Authors:
Chayan Banerjee,
Zhiyong Chen,
Nasimul Noman
Abstract:
Soft Actor-Critic (SAC) is an off-policy actor-critic reinforcement learning algorithm, essentially based on entropy regularization. SAC trains a policy by maximizing the trade-off between expected return and entropy (randomness in the policy). It has achieved state-of-the-art performance on a range of continuous-control benchmark tasks, outperforming prior on-policy and off-policy methods. SAC wo…
▽ More
Soft Actor-Critic (SAC) is an off-policy actor-critic reinforcement learning algorithm, essentially based on entropy regularization. SAC trains a policy by maximizing the trade-off between expected return and entropy (randomness in the policy). It has achieved state-of-the-art performance on a range of continuous-control benchmark tasks, outperforming prior on-policy and off-policy methods. SAC works in an off-policy fashion where data are sampled uniformly from past experiences (stored in a buffer) using which parameters of the policy and value function networks are updated. We propose certain crucial modifications for boosting the performance of SAC and make it more sample efficient. In our proposed improved SAC, we firstly introduce a new prioritization scheme for selecting better samples from the experience replay buffer. Secondly we use a mixture of the prioritized off-policy data with the latest on-policy data for training the policy and the value function networks. We compare our approach with the vanilla SAC and some recent variants of SAC and show that our approach outperforms the said algorithmic benchmarks. It is comparatively more stable and sample efficient when tested on a number of continuous control tasks in MuJoCo environments.
△ Less
Submitted 24 September, 2021;
originally announced September 2021.
-
Optimal Actor-Critic Policy with Optimized Training Datasets
Authors:
Chayan Banerjee,
Zhiyong Chen,
Nasimul Noman,
Mohsen Zamani
Abstract:
Actor-critic (AC) algorithms are known for their efficacy and high performance in solving reinforcement learning problems, but they also suffer from low sampling efficiency. An AC based policy optimization process is iterative and needs to frequently access the agent-environment system to evaluate and update the policy by rolling out the policy, collecting rewards and states (i.e. samples), and le…
▽ More
Actor-critic (AC) algorithms are known for their efficacy and high performance in solving reinforcement learning problems, but they also suffer from low sampling efficiency. An AC based policy optimization process is iterative and needs to frequently access the agent-environment system to evaluate and update the policy by rolling out the policy, collecting rewards and states (i.e. samples), and learning from them. It ultimately requires a huge number of samples to learn an optimal policy. To improve sampling efficiency, we propose a strategy to optimize the training dataset that contains significantly less samples collected from the AC process. The dataset optimization is made of a best episode only operation, a policy parameter-fitness model, and a genetic algorithm module. The optimal policy network trained by the optimized training dataset exhibits superior performance compared to many contemporary AC algorithms in controlling autonomous dynamical systems. Evaluation on standard benchmarks show that the method improves sampling efficiency, ensures faster convergence to optima, and is more data-efficient than its counterparts.
△ Less
Submitted 1 December, 2021; v1 submitted 16 August, 2021;
originally announced August 2021.
-
Software Security Rules, SDLC Perspective
Authors:
C. Banerjee,
S. K. Pandey
Abstract:
Software has become an integral part of everyday life. Everyday, millions of people perform transaction through internet, ATM, mobile phone, they send email and Egreetings, and use word processing and spreadsheet for various purpose. People use software bearing in mind that it is reliable and can be trust upon and the operation they perform is secured. Now, if these software have exploitable sec…
▽ More
Software has become an integral part of everyday life. Everyday, millions of people perform transaction through internet, ATM, mobile phone, they send email and Egreetings, and use word processing and spreadsheet for various purpose. People use software bearing in mind that it is reliable and can be trust upon and the operation they perform is secured. Now, if these software have exploitable security hole then how can they be safe for use. Security brings value to software in terms of peoples trust. The value provided by secure software is of vital importance because many critical functions are entirely dependent on the software. That is why security is a serious topic which should be given proper attention during the entire SDLC, right from the beginning. For the proper implementation of security in the software, twenty one security rules are proposed in this paper along with validation results. It is found that by applying these rules as per given implementation mechanism, most of the vulnerabilities are eliminated in the software and a more secure software can be built.
△ Less
Submitted 2 November, 2009;
originally announced November 2009.