Search | arXiv e-print repository

doi 10.1145/3631700.3664869

Beyond Static Calibration: The Impact of User Preference Dynamics on Calibrated Recommendation

Authors: Kun Lin, Masoud Mansoury, Farzad Eskandanian, Milad Sabouri, Bamshad Mobasher

Abstract: Calibration in recommender systems is an important performance criterion that ensures consistency between the distribution of user preference categories and that of recommendations generated by the system. Standard methods for mitigating miscalibration typically assume that user preference profiles are static, and they measure calibration relative to the full history of a user's interactions, including possibly outdated and stale preference categories. We conjecture that this approach can lead to recommendations that, while appearing calibrated, in fact distort users' true preferences. In this paper, we conduct a preliminary investigation of recommendation calibration at a more granular level, taking into account evolving user preferences. By analyzing differently sized training time windows from the most recent interactions to the oldest, we identify the most relevant segment of a user's preferences that optimizes the calibration metric. We perform an exploratory analysis with datasets from different domains with distinctive user-interaction characteristics. We demonstrate how the evolving nature of user preferences affects recommendation calibration, and how this effect is manifested differently depending on the characteristics of the data in a given domain. Datasets, code, and more detailed experimental results are available at: https://github.com/nicolelin13/DynamicCalibrationUMAP.

Submitted 16 May, 2024; originally announced May 2024.

Comments: 8 pages, 4 figures, accepted as LBR paper at UMAP '24 -- ACM Conference on User Modeling, Adaptation and Personalization 2024

MSC Class: 68-06; ACM Class: H.3.4
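
The calibration idea in the abstract above (comparing the category distribution of a user's profile with that of the recommendations, over the full history or over a recent window) can be sketched as follows. This is only an illustration under stated assumptions: the KL-divergence form with smoothing is one common way to measure miscalibration and is not confirmed by this listing as the authors' metric, and the names `category_distribution`, `miscalibration`, `recent_window`, and the parameter `alpha` are hypothetical.

```python
from collections import Counter
import math


def category_distribution(categories, all_categories):
    """Normalized frequency of each category in a list of category labels."""
    counts = Counter(categories)
    total = sum(counts.values())
    return {c: (counts[c] / total if total else 0.0) for c in all_categories}


def miscalibration(profile_dist, rec_dist, alpha=0.01):
    """Assumed metric: KL(p || q~), where q~ = (1 - alpha) * q + alpha * p.

    The smoothing keeps the divergence finite when the recommendations
    miss a category that appears in the profile.
    """
    kl = 0.0
    for category, p in profile_dist.items():
        if p == 0.0:
            continue  # categories absent from the profile contribute nothing
        q = (1 - alpha) * rec_dist.get(category, 0.0) + alpha * p
        kl += p * math.log(p / q)
    return kl


def recent_window(interactions, k):
    """Categories of the k most recent (timestamp, category) interactions."""
    ordered = sorted(interactions, key=lambda pair: pair[0])
    return [category for _, category in ordered[-k:]]


# Toy data (hypothetical): the user moved from drama to comedy over time.
history = [(1, "drama"), (2, "drama"), (3, "comedy"), (4, "comedy")]
recommended = ["comedy", "comedy"]
cats = ["drama", "comedy"]

rec_dist = category_distribution(recommended, cats)
full_profile = category_distribution([c for _, c in history], cats)
recent_profile = category_distribution(recent_window(history, 2), cats)

score_full = miscalibration(full_profile, rec_dist)
score_recent = miscalibration(recent_profile, rec_dist)
```

In this toy case the recommendations look miscalibrated against the full history but calibrated against the two most recent interactions, which is the kind of discrepancy the abstract describes.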

arXiv:2309.02322 [pdf, other]

Fairness of Exposure in Dynamic Recommendation

Authors: Masoud Mansoury, Bamshad Mobasher

Abstract: Exposure bias is a well-known issue in recommender systems where the exposure is not fairly distributed among items in the recommendation results. This is especially problematic when bias is amplified over time as a few items (e.g., popular ones) are repeatedly over-represented in recommendation lists and users' interactions with those items will amplify bias towards those items over time, resulting in a feedback loop. This issue has been extensively studied in the literature in a static recommendation environment where a single round of recommendation results is processed to improve the exposure fairness. However, less work has been done on addressing exposure bias in a dynamic recommendation setting where the system is operating over time and the recommendation model and the input data are dynamically updated with ongoing user feedback on recommended items at each round. In this paper, we study exposure bias in a dynamic recommendation setting. Our goal is to show that existing bias mitigation methods that are designed to operate in a static recommendation setting are unable to satisfy fairness of exposure for items in the long run. In particular, we empirically study one of these methods and show that repeatedly applying this method fails to fairly distribute exposure among items in the long run. To address this limitation, we show how this method can be adapted to effectively operate in a dynamic recommendation setting and achieve exposure fairness for items in the long run. Experiments on a real-world dataset confirm that our solution is superior in achieving long-term exposure fairness for the items while maintaining the recommendation accuracy.

Submitted 5 September, 2023; originally announced September 2023.

arXiv:2209.01665 [pdf, ps, other]

Exposure-Aware Recommendation using Contextual Bandits

Authors: Masoud Mansoury, Bamshad Mobasher, Herke van Hoof

Abstract: Exposure bias is a well-known issue in recommender systems where items and suppliers are not equally represented in the recommendation results. This is especially problematic when bias is amplified over time as a few items (e.g., popular ones) are repeatedly over-represented in recommendation lists and users' interactions with those items will amplify bias towards those items over time, resulting in a feedback loop. This issue has been extensively studied in the literature on model-based or neighborhood-based recommendation algorithms, but less work has been done on online recommendation models, such as those based on top-K contextual bandits, where recommendation models are dynamically updated with ongoing user feedback. In this paper, we study exposure bias in a class of well-known contextual bandit algorithms known as Linear Cascading Bandits. We analyze these algorithms on their ability to handle exposure bias and provide a fair representation for items in the recommendation results. Our analysis reveals that these algorithms tend to amplify exposure disparity among items over time. In particular, we observe that these algorithms do not properly adapt to the feedback provided by the users and frequently recommend certain items even when those items are not selected by users. To mitigate this bias, we propose an Exposure-Aware (EA) reward model that updates the model parameters based on two factors: 1) user feedback (i.e., clicked or not), and 2) position of the item in the recommendation list. This way, the proposed model controls the utility assigned to items based on their exposure in the recommendation list. Extensive experiments on two real-world datasets using three contextual bandit algorithms show that the proposed reward model reduces exposure bias amplification in the long run while maintaining the recommendation accuracy.

Submitted 4 September, 2022; originally announced September 2022.

arXiv:2207.00528 [pdf, ps, other]

Behavioral Player Rating in Competitive Online Shooter Games

Authors: Arman Dehpanah, Muheeb Faizan Ghori, Jonathan Gemmell, Bamshad Mobasher

Abstract: Competitive online games use rating systems for matchmaking: progression-based algorithms that estimate the skill level of players with interpretable ratings in terms of the outcome of the games they played. However, the overall experience of players is shaped by factors beyond the sole outcome of their games. In this paper, we engineer several features from in-game statistics to model players and create ratings that accurately represent their behavior and true performance level. We then compare the estimating power of our behavioral ratings against ratings created with three mainstream rating systems by predicting the rank of players in four popular game modes from the competitive shooter genre. Our results show that the behavioral ratings present more accurate performance estimations while maintaining the interpretability of the created representations. Considering different aspects of the playing behavior of players and using behavioral ratings for matchmaking can lead to match-ups that are more aligned with players' goals and interests, consequently resulting in a more enjoyable gaming experience.

Submitted 1 July, 2022; originally announced July 2022.

Comments: Accepted in The 20th International Conference on Scientific Computing (CSC'22)

arXiv:2205.12408 [pdf, other]

doi 10.1145/3511047.3537657

Using user's local context to support local news

Authors: Payam Pourashraf, Bamshad Mobasher

Abstract: American local newspapers have been experiencing a large loss of reader retention and business within the past 15 years due to the proliferation of online news sources. Local media companies are starting to shift from an advertising-supported business model to one based on subscriptions to mitigate this problem. With this subscription model, there is a need to increase user engagement and personalization, and recommender systems are one way for these news companies to accomplish this goal. However, using standard modeling approaches that focus on users' global preferences is not appropriate in this context because the local preferences of users exhibit some specific characteristics which do not necessarily match their long-term or global preferences in the news. Our research explores a localized session-based recommendation approach, using recommendations based on local news articles and articles pertaining to the different local news categories. Experiments performed on a news dataset from a local newspaper show that these local models, particularly certain categories of items, do indeed provide more accuracy and effectiveness for personalization which, in turn, may lead to more user engagement with local news content.

Submitted 25 May, 2022; v1 submitted 24 May, 2022; originally announced May 2022.

arXiv:2112.04379 [pdf, other]

Player Modeling using Behavioral Signals in Competitive Online Games

Authors: Arman Dehpanah, Muheeb Faizan Ghori, Jonathan Gemmell, Bamshad Mobasher

Abstract: Competitive online games use rating systems to match players with similar skills to ensure a satisfying experience for players. In this paper, we focus on the importance of addressing different aspects of playing behavior when modeling players for creating match-ups. To this end, we engineer several behavioral features from a dataset of over 75,000 battle royale matches and create player models based on the retrieved features. We then use the created models to predict ranks for different groups of players in the data. The predicted ranks are compared to those of three popular rating systems. Our results show the superiority of simple behavioral models over mainstream rating systems. Some behavioral features provided accurate predictions for all groups of players while others proved useful for certain groups of players. The results of this study highlight the necessity of considering different aspects of the player's behavior such as goals, strategy, and expertise when making assignments.

Submitted 29 November, 2021; originally announced December 2021.

Comments: Accepted in the 2021 International Conference on Computational Science and Computational Intelligence (CSCI'21)

arXiv:2109.00982 [pdf, other]

How does the User's Knowledge of the Recommender Influence their Behavior?

Authors: Muheeb Faizan Ghori, Arman Dehpanah, Jonathan Gemmell, Hamed Qahri-Saremi, Bamshad Mobasher

Abstract: Recommender systems have become a ubiquitous part of modern web applications. They help users discover new and relevant items. Today's users, through years of interaction with these systems, have developed an inherent understanding of how recommender systems function, what their objectives are, and how the user might manipulate them. We describe this understanding as the Theory of the Recommender. In this study, we conducted semi-structured interviews with forty recommender system users to empirically explore the relevant factors influencing user behavior. Our findings, based on a rigorous thematic analysis of the collected data, suggest that users possess an intuitive and sophisticated understanding of the recommender system's behavior. We also found that users, based upon their understanding, attitude, and intentions, change their interactions to evoke desired recommender behavior. Finally, we discuss the potential implications of such user behavior on recommendation performance.

Submitted 2 September, 2021; originally announced September 2021.

Comments: IntRS'21@RecSys: Joint Workshop on Interfaces and Human Decision Making for Recommender Systems, September 25, 2021, Virtual Event

arXiv:2108.03440 [pdf, other]

Unbiased Cascade Bandits: Mitigating Exposure Bias in Online Learning to Rank Recommendation

Authors: Masoud Mansoury, Himan Abdollahpouri, Bamshad Mobasher, Mykola Pechenizkiy, Robin Burke, Milad Sabouri

Abstract: Exposure bias is a well-known issue in recommender systems where items and suppliers are not equally represented in the recommendation results. This is especially problematic when bias is amplified over time as a few popular items are repeatedly over-represented in recommendation lists. This phenomenon can be viewed as a recommendation feedback loop: the system repeatedly recommends certain items at different time points and interactions of users with those items will amplify bias towards those items over time. This issue has been extensively studied in the literature on model-based or neighborhood-based recommendation algorithms, but less work has been done on online recommendation models such as those based on multi-armed bandit algorithms. In this paper, we study exposure bias in a class of well-known bandit algorithms known as Linear Cascade Bandits. We analyze these algorithms on their ability to handle exposure bias and provide a fair representation for items and suppliers in the recommendation results. Our analysis reveals that these algorithms fail to treat items and suppliers fairly and do not sufficiently explore the item space for each user. To mitigate this bias, we propose a discounting factor that controls the exposure of items at each time step and incorporate it into these algorithms. To show the effectiveness of the proposed discounting factor in mitigating exposure bias, we perform experiments on two datasets using three cascading bandit algorithms, and our experimental results show that the proposed method improves the exposure fairness for items and suppliers.

Submitted 7 August, 2021; originally announced August 2021.

arXiv:2107.03415 [pdf, other]

doi 10.1145/3470948

A Graph-based Approach for Mitigating Multi-sided Exposure Bias in Recommender Systems

Authors: Masoud Mansoury, Himan Abdollahpouri, Mykola Pechenizkiy, Bamshad Mobasher, Robin Burke

Abstract: Fairness is a critical system-level objective in recommender systems that has been the subject of extensive recent research. A specific form of fairness is supplier exposure fairness where the objective is to ensure equitable coverage of items across all suppliers in recommendations provided to users. This is especially important in multistakeholder recommendation scenarios where it may be important to optimize utilities not just for the end-user, but also for other stakeholders such as item sellers or producers who desire a fair representation of their items. This type of supplier fairness is sometimes accomplished by attempting to increase aggregate diversity in order to mitigate popularity bias and to improve the coverage of long-tail items in recommendations. In this paper, we introduce FairMatch, a general graph-based algorithm that works as a post-processing approach after recommendation generation to improve exposure fairness for items and suppliers. The algorithm iteratively adds high-quality items that have low visibility or items from suppliers with low exposure to the users' final recommendation lists. A comprehensive set of experiments on two datasets and comparison with state-of-the-art baselines show that FairMatch, while significantly improving exposure fairness and aggregate diversity, maintains an acceptable level of relevance of the recommendations.

Submitted 7 July, 2021; originally announced July 2021.

Comments: arXiv admin note: substantial text overlap with arXiv:2005.01148

arXiv:2106.11397 [pdf, other]

Evaluating Team Skill Aggregation in Online Competitive Games

Authors: Arman Dehpanah, Muheeb Faizan Ghori, Jonathan Gemmell, Bamshad Mobasher

Abstract: One of the main goals of online competitive games is increasing player engagement by ensuring fair matches. These games use rating systems for creating balanced match-ups. Rating systems leverage statistical estimation to rate players' skills and use skill ratings to predict rank before matching players. Skill ratings of individual players can be aggregated to compute the skill level of a team. While research often aims to improve the accuracy of skill estimation and fairness of match-ups, less attention has been given to how the skill level of a team is calculated from the skill level of its members. In this paper, we propose two new aggregation methods and compare them with a standard approach extensively used in the research literature. We present an exhaustive analysis of the impact of these methods on the predictive performance of rating systems. We perform our experiments using three popular rating systems, Elo, Glicko, and TrueSkill, on three real-world datasets including over 100,000 battle royale and head-to-head matches. Our evaluations show the superiority of the MAX method over the other two methods in the majority of the tested cases, implying that the overall performance of a team is best determined by the performance of its most skilled member. The results of this study highlight the necessity of devising more elaborate methods for calculating a team's performance -- methods covering different aspects of players' behavior such as skills, strategy, or goals.

Submitted 21 June, 2021; originally announced June 2021.

Comments: Accepted in IEEE Conference on Games 2021
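
The abstract above contrasts a MAX aggregation with two other ways of turning member ratings into a team rating. A minimal sketch of that comparison follows. It assumes the other two methods are a sum and a minimum over member ratings, which this listing does not confirm; `aggregate_team_skill` and `predict_ranking` are hypothetical names, and the ratings are made-up numbers rather than output from Elo, Glicko, or TrueSkill.

```python
def aggregate_team_skill(ratings, method="sum"):
    """Collapse individual skill ratings into a single team rating.

    "max" follows the MAX method named in the abstract; "sum" and "min"
    are assumed here purely for comparison.
    """
    if not ratings:
        raise ValueError("a team needs at least one rated player")
    if method == "sum":
        return sum(ratings)
    if method == "max":
        return max(ratings)
    if method == "min":
        return min(ratings)
    raise ValueError(f"unknown aggregation method: {method}")


def predict_ranking(teams, method):
    """Order team names from strongest to weakest predicted team rating."""
    scores = {name: aggregate_team_skill(r, method) for name, r in teams.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical ratings: team A is balanced, team B has one standout player.
teams = {"A": [1500, 1500, 1500], "B": [2000, 1000, 1000]}

ranking_by_sum = predict_ranking(teams, "sum")  # A: 4500, B: 4000
ranking_by_max = predict_ranking(teams, "max")  # A: 1500, B: 2000
```

The two methods disagree on which team should be predicted to place higher, which is why the choice of aggregation can change the measured predictive performance of a rating system.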

arXiv:2105.14069 [pdf, other]

The Evaluation of Rating Systems in Team-based Battle Royale Games

Authors: Arman Dehpanah, Muheeb Faizan Ghori, Jonathan Gemmell, Bamshad Mobasher

Abstract: Online competitive games have become a mainstream entertainment platform. To create a fair and exciting experience, these games use rating systems to match players with similar skills. While there has been an increasing amount of research on improving the performance of these systems, less attention has been paid to how their performance is evaluated. In this paper, we explore the utility of several metrics for evaluating three popular rating systems on a real-world dataset of over 25,000 team battle royale matches. Our results suggest considerable differences in their evaluation patterns. Some metrics were highly impacted by the inclusion of new players. Many could not capture the real differences between certain groups of players. Among all metrics studied, normalized discounted cumulative gain (NDCG) demonstrated more reliable performance and more flexibility. It alleviated most of the challenges faced by the other metrics while adding the freedom to adjust the focus of the evaluations on different groups of players.

Submitted 29 June, 2021; v1 submitted 28 May, 2021; originally announced May 2021.

Comments: Updated references -- 10 pages, 1 figure, Accepted in the 23rd International Conference on Artificial Intelligence (ICAI'21)
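
For readers unfamiliar with the metric recommended in the abstract above, here is a minimal sketch of NDCG applied to a predicted ordering of players. The exponential gain and log2 discount are the common textbook form. The mapping from observed rank to relevance in `rank_to_relevance` is an assumption made for illustration and is not taken from the paper.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of relevance scores listed in ranked order."""
    return sum((2 ** rel - 1) / math.log2(i + 2) for i, rel in enumerate(relevances))


def ndcg(relevances_in_predicted_order):
    """DCG of the predicted order divided by the DCG of the ideal order."""
    ideal = dcg(sorted(relevances_in_predicted_order, reverse=True))
    return dcg(relevances_in_predicted_order) / ideal if ideal > 0 else 0.0


def rank_to_relevance(observed_ranks):
    """Assumed mapping: with n players, observed rank r (1 = best) gets n - r."""
    n = len(observed_ranks)
    return [n - r for r in observed_ranks]


# Observed ranks of three players, listed in the order the system predicted.
perfect_prediction = ndcg(rank_to_relevance([1, 2, 3]))
reversed_prediction = ndcg(rank_to_relevance([3, 2, 1]))
```

Because the discount shrinks with position, mistakes near the top of the predicted order cost more than mistakes near the bottom, which matches the abstract's point about adjusting the focus of the evaluation.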

arXiv:2103.06909 [pdf, other]

doi 10.1145/3442442.3452327

Toward the Next Generation of News Recommender Systems

Authors: Himan Abdollahpouri, Edward Malthouse, Joseph Konstan, Bamshad Mobasher, Jeremy Gilbert

Abstract: This paper proposes a vision and research agenda for the next generation of news recommender systems (RS), called the table d'hote approach. A table d'hote (translates as host's table) meal is a sequence of courses that create a balanced and enjoyable dining experience for a guest. Likewise, we believe news RS should strive to create a similar experience for the users by satisfying the news-diet needs of a user. While extant news RS considers criteria such as diversity and serendipity, and RS bundles have been studied for other contexts such as tourism, table d'hote goes further by ensuring the recommended articles satisfy a diverse set of user needs in the right proportions and in a specific order. In table d'hote, available articles need to be stratified based on the different ways that news can create value for the reader, building from theories and empirical research in journalism and user engagement. Using theories and empirical research from communication on the uses and gratifications (U&G) consumers derive from media, we define two main strata in a table d'hote news RS, each with its own substrata: 1) surveillance, which consists of information the user needs to know, and 2) serendipity, which are the articles offering unexpected surprises. The diversity of the articles according to the defined strata and the order of the articles within the list of recommendations are also two important aspects of the table d'hote in order to give the users the most effective reading experience. We propose our vision, link it to the existing concepts in the RS literature, and identify challenges for future research.

Submitted 11 March, 2021; originally announced March 2021.

Comments: WWW '21 Companion, April 19-23, 2021, Ljubljana, Slovenia

arXiv:2103.06364 [pdf, other]

doi 10.1145/3450613.3456821

User-centered Evaluation of Popularity Bias in Recommender Systems

Authors: Himan Abdollahpouri, Masoud Mansoury, Robin Burke, Bamshad Mobasher, Edward Malthouse

Abstract: Recommendation and ranking systems are known to suffer from popularity bias: the tendency of the algorithm to favor a few popular items while under-representing the majority of other items. Prior research has examined various approaches for mitigating popularity bias and enhancing the recommendation of long-tail, less popular items. The effectiveness of these approaches is often assessed using different metrics to evaluate the extent to which over-concentration on popular items is reduced. However, not much attention has been given to the user-centered evaluation of this bias: how different users with different levels of interest towards popular items are affected by such algorithms. In this paper, we show the limitations of the existing metrics to evaluate popularity bias mitigation when we want to assess these algorithms from the users' perspective and we propose a new metric that can address these limitations. In addition, we present an effective approach that mitigates popularity bias from the user-centered point of view. Finally, we investigate several state-of-the-art approaches proposed in recent years to mitigate popularity bias and evaluate their performances using the existing metrics and also from the users' perspective. Our experimental results using two publicly available datasets show that existing popularity bias mitigation techniques ignore the users' tolerance towards popular items. Our proposed user-centered method can tackle popularity bias effectively for different users while also improving the existing metrics.

Submitted 10 March, 2021; originally announced March 2021.

Comments: Proceedings of the 29th ACM Conference on User Modeling, Adaptation and Personalization (UMAP '21), June 21--25, 2021, Utrecht, Netherlands. arXiv admin note: text overlap with arXiv:2007.12230

arXiv:2008.09273 [pdf, other]

The Connection Between Popularity Bias, Calibration, and Fairness in Recommendation

Authors: Himan Abdollahpouri, Masoud Mansoury, Robin Burke, Bamshad Mobasher

Abstract: Recently there has been a growing interest in fairness-aware recommender systems, including fairness in providing consistent performance across different users or groups of users. A recommender system could be considered unfair if the recommendations do not fairly represent the tastes of a certain group of users while other groups receive recommendations that are consistent with their preferences. In this paper, we use a metric called miscalibration for measuring how a recommendation algorithm is responsive to users' true preferences and we consider how various algorithms may result in different degrees of miscalibration for different users. In particular, we conjecture that popularity bias, which is a well-known phenomenon in recommendation, is one important factor leading to miscalibration in recommendation. Our experimental results using two real-world datasets show that there is a connection between how different user groups are affected by algorithmic popularity bias and their level of interest in popular items. Moreover, we show that the more a group is affected by the algorithmic popularity bias, the more their recommendations are miscalibrated.

Submitted 20 August, 2020; originally announced August 2020.

Comments: Accepted at the 14th ACM Conference on Recommender Systems (RecSys 2020) Late Breaking Results Track. arXiv admin note: substantial text overlap with arXiv:1910.05755

arXiv:2008.06787 [pdf, other]

The Evaluation of Rating Systems in Online Free-for-All Games

Authors: Arman Dehpanah, Muheeb Faizan Ghori, Jonathan Gemmell, Bamshad Mobasher

Abstract: Online competitive games have become increasingly popular. To ensure an exciting and competitive environment, these games routinely attempt to match players with similar skill levels. Matching players is often accomplished through a rating system. There has been an increasing amount of research on developing such rating systems. However, less attention has been given to the evaluation metrics of these systems. In this paper, we present an exhaustive analysis of six metrics for evaluating rating systems in online competitive games. We compare traditional metrics such as accuracy. We then introduce other metrics adapted from the field of information retrieval. We evaluate these metrics against several well-known rating systems on a large real-world dataset of over 100,000 free-for-all matches. Our results show stark differences in their utility. Some metrics do not consider deviations between two ranks. Others are inordinately impacted by new players. Many do not capture the importance of distinguishing between errors in higher ranks and lower ranks. Among all metrics studied, we recommend Normalized Discounted Cumulative Gain (NDCG) because not only does it resolve the issues faced by other metrics, but it also offers flexibility to adjust the evaluations based on the goals of the system.

Submitted 15 August, 2020; originally announced August 2020.

Comments: 10 pages, 1 figure, accepted and presented in 16th International Conference on Data Science (ICDATA'20)

arXiv:2007.13019 [pdf, other]

Feedback Loop and Bias Amplification in Recommender Systems

Authors: Masoud Mansoury, Himan Abdollahpouri, Mykola Pechenizkiy, Bamshad Mobasher, Robin Burke

Abstract: Recommendation algorithms are known to suffer from popularity bias; a few popular items are recommended frequently while the majority of other items are ignored. These recommendations are then consumed by the users, whose reactions are logged and added to the system: what is generally known as a feedback loop. In this paper, we propose a method for simulating users' interaction with the recommenders in an offline setting and study the impact of the feedback loop on the popularity bias amplification of several recommendation algorithms. We then show how this bias amplification leads to several other problems such as declining aggregate diversity, shifting the representation of users' taste over time and also homogenization of the users' experience. In particular, we show that the impact of the feedback loop is generally stronger for the users who belong to the minority group.

Submitted 25 July, 2020; originally announced July 2020.

arXiv:2007.12230 [pdf, other]

Addressing the Multistakeholder Impact of Popularity Bias in Recommendation Through Calibration

Authors: Himan Abdollahpouri, Masoud Mansoury, Robin Burke, Bamshad Mobasher

Abstract: Popularity bias is a well-known phenomenon in recommender systems: popular items are recommended even more frequently than their popularity would warrant, amplifying long-tail effects already present in many recommendation domains. Prior research has examined various approaches for mitigating popularity bias and enhancing the recommendation of long-tail items overall. The effectiveness of these approaches, however, has not been assessed in multistakeholder environments where in addition to the users who receive the recommendations, the utility of the suppliers of the recommended items should also be considered. In this paper, we propose the concept of popularity calibration which measures the match between the popularity distribution of items in a user's profile and that of the recommended items. We also develop an algorithm that optimizes this metric. In addition, we demonstrate that existing evaluation metrics for popularity bias do not reflect the performance of the algorithms when it is measured from the perspective of different stakeholders. Using music and movie datasets, we empirically show that our approach outperforms the existing state-of-the-art approaches in addressing popularity bias by calibrating the recommendations to users' preferences. We also show that our proposed algorithm has a secondary effect of improving supplier fairness.

Submitted 23 July, 2020; originally announced July 2020.
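
The popularity calibration described in the abstract above compares the popularity distribution of a user's profile with that of the recommended list. The following sketch shows one way such a comparison could be computed. The three popularity groups ("head", "mid", "tail"), the helper names, and the use of Jensen-Shannon divergence as the distance are assumptions for illustration and are not confirmed by the abstract.

```python
import math
from collections import Counter

GROUPS = ("head", "mid", "tail")  # hypothetical popularity buckets


def group_distribution(items, item_group):
    """Share of items falling in each popularity group (items must be non-empty)."""
    counts = Counter(item_group[i] for i in items)
    total = sum(counts.values())
    return [counts[g] / total for g in GROUPS]


def kl(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: 0 = identical, 1 = disjoint support."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


item_group = {"i1": "head", "i2": "head", "i3": "mid", "i4": "tail"}
profile = ["i1", "i3", "i4", "i4"]          # what the user interacted with
recommended = ["i1", "i2", "i1", "i2"]      # a head-only recommendation list
p = group_distribution(profile, item_group)
q = group_distribution(recommended, item_group)
print(js_divergence(p, q))  # larger values mean poorer popularity calibration
```

A lower divergence indicates that the recommendation list mirrors the user's own mix of popular and niche items more closely.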

arXiv:2006.03715 [pdf, other]

doi 10.1145/3340631.3394858

Using Stable Matching to Optimize the Balance between Accuracy and Diversity in Recommendation

Authors: Farzad Eskandanian, Bamshad Mobasher

Abstract: Increasing aggregate diversity (or catalog coverage) is an important system-level objective in many recommendation domains where it may be desirable to mitigate the popularity bias and to improve the coverage of long-tail items in recommendations given to users. This is especially important in multistakeholder recommendation scenarios where it may be important to optimize utilities not just for the end user, but also for other stakeholders such as item sellers or producers who desire a fair representation of their items across recommendation lists produced by the system. Unfortunately, attempts to increase aggregate diversity often result in lower recommendation accuracy for end users. Thus, addressing this problem requires an approach that can effectively manage the trade-offs between accuracy and aggregate diversity. In this work, we propose a two-sided post-processing approach in which both user and item utilities are considered. Our goal is to maximize aggregate diversity while minimizing loss in recommendation accuracy. Our solution is a generalization of the Deferred Acceptance algorithm which was proposed as an efficient algorithm to solve the well-known stable matching problem. We prove that our algorithm results in a unique user-optimal stable match between items and users. Using three recommendation datasets, we empirically demonstrate the effectiveness of our approach in comparison to several baselines. In particular, our results show that the proposed solution is quite effective in increasing aggregate diversity and item-side utility while optimizing recommendation accuracy for end users.

Submitted 5 June, 2020; originally announced June 2020.
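
The abstract above builds on the Deferred Acceptance algorithm for stable matching. For orientation, here is a minimal sketch of the classic one-to-one version (Gale-Shapley) with users proposing to items. It is not the paper's generalization, which is not specified in the abstract; the function name and the assumption of complete, strict preference lists are illustrative.

```python
def deferred_acceptance(proposer_prefs, receiver_prefs):
    """Classic one-to-one deferred acceptance; returns proposer -> receiver.

    proposer_prefs: dict proposer -> list of receivers, most preferred first.
    receiver_prefs: dict receiver -> list of proposers, most preferred first.
    Assumes every receiver ranks every proposer that may propose to it.
    """
    # rank[r][p] is the position of proposer p in receiver r's list.
    rank = {r: {p: i for i, p in enumerate(prefs)}
            for r, prefs in receiver_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}  # next receiver to try
    engaged = {}                                  # receiver -> proposer
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        if next_choice[p] >= len(proposer_prefs[p]):
            continue  # list exhausted: p stays unmatched
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = engaged.get(r)
        if current is None:
            engaged[r] = p            # r tentatively accepts p
        elif rank[r][p] < rank[r][current]:
            engaged[r] = p            # r trades up and releases current
            free.append(current)
        else:
            free.append(p)            # r rejects p, who proposes again later
    return {p: r for r, p in engaged.items()}


users = {"u1": ["i1", "i2"], "u2": ["i1", "i2"]}
items = {"i1": ["u2", "u1"], "i2": ["u1", "u2"]}
print(deferred_acceptance(users, items))
```

In the classic setting the proposing side receives its best achievable stable partner, which is consistent with the user-optimal property the abstract claims for its generalized algorithm.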

arXiv:2005.12974 [pdf, other]

Opportunistic Multi-aspect Fairness through Personalized Re-ranking

Authors: Nasim Sonboli, Farzad Eskandanian, Robin Burke, Weiwen Liu, Bamshad Mobasher

Abstract: As recommender systems have become more widespread and moved into areas with greater social impact, such as employment and housing, researchers have begun to seek ways to ensure fairness in the results that such systems produce. This work has primarily focused on developing recommendation approaches in which fairness metrics are jointly optimized along with recommendation accuracy. However, previous work has largely ignored how individual preferences may limit the ability of an algorithm to produce fair recommendations. Furthermore, with few exceptions, researchers have only considered scenarios in which fairness is measured relative to a single sensitive feature or attribute (such as race or gender). In this paper, we present a re-ranking approach to fairness-aware recommendation that learns individual preferences across multiple fairness dimensions and uses them to enhance provider fairness in recommendation results. Specifically, we show that our opportunistic and metric-agnostic approach achieves a better trade-off between accuracy and fairness than prior re-ranking approaches and does so across multiple fairness dimensions.

Submitted 21 May, 2020; originally announced May 2020.

arXiv:2005.01148 [pdf, other]

FairMatch: A Graph-based Approach for Improving Aggregate Diversity in Recommender Systems

Authors: Masoud Mansoury, Himan Abdollahpouri, Mykola Pechenizkiy, Bamshad Mobasher, Robin Burke

Abstract: Recommender systems are often biased toward popular items. In other words, few items are frequently recommended while the majority of items do not get proportionate attention. That leads to low coverage of items in recommendation lists across users (i.e. low aggregate diversity) and unfair distribution of recommended items. In this paper, we introduce FairMatch, a general graph-based algorithm that works as a post-processing approach after recommendation generation for improving aggregate diversity. The algorithm iteratively finds items that are rarely recommended yet are high-quality and adds them to the users' final recommendation lists. This is done by solving the maximum flow problem on the recommendation bipartite graph. While we focus on aggregate diversity and fair distribution of recommended items, the algorithm can be adapted to other recommendation scenarios using different underlying definitions of fairness. A comprehensive set of experiments on two datasets and comparison with state-of-the-art baselines show that FairMatch, while significantly improving aggregate diversity, provides comparable recommendation accuracy.

Submitted 3 May, 2020; originally announced May 2020.

arXiv:2002.07786 [pdf, other]

Investigating Potential Factors Associated with Gender Discrimination in Collaborative Recommender Systems

Authors: Masoud Mansoury, Himan Abdollahpouri, Jessie Smith, Arman Dehpanah, Mykola Pechenizkiy, Bamshad Mobasher

Abstract: The proliferation of personalized recommendation technologies has raised concerns about discrepancies in their recommendation performance across different genders, age groups, and racial or ethnic populations. This varying degree of performance could impact users' trust in the system and may pose legal and ethical issues in domains where fairness and equity are critical concerns, like job recommendation. In this paper, we investigate several potential factors that could be associated with discriminatory performance of a recommendation algorithm for women versus men. We specifically study several characteristics of user profiles and analyze their possible associations with disparate behavior of the system towards different genders. These characteristics include the anomaly in rating behavior, the entropy of users' profiles, and the users' profile size. Our experimental results on a public dataset using four recommendation algorithms show that, based on all three factors, women get less accurate recommendations than men, indicating an unfair nature of recommendation algorithms across genders.

Submitted 18 February, 2020; originally announced February 2020.

arXiv:1910.05755 [pdf, other]

The Impact of Popularity Bias on Fairness and Calibration in Recommendation

Authors: Himan Abdollahpouri, Masoud Mansoury, Robin Burke, Bamshad Mobasher

Abstract: Recently there has been a growing interest in fairness-aware recommender systems, including fairness in providing consistent performance across different users or groups of users. A recommender system could be considered unfair if the recommendations do not fairly represent the tastes of a certain group of users while other groups receive recommendations that are consistent with their preferences. In this paper, we use a metric called miscalibration for measuring how a recommendation algorithm is responsive to users' true preferences and we consider how various algorithms may result in different degrees of miscalibration. A well-known type of bias in recommendation is popularity bias where few popular items are over-represented in recommendations, while the majority of other items do not get significant exposure. We conjecture that popularity bias is one important factor leading to miscalibration in recommendation. Our experimental results using two real-world datasets show that there is a strong correlation between how different user groups are affected by algorithmic popularity bias and their level of interest in popular items. Moreover, we show algorithms with greater popularity bias amplification tend to have greater miscalibration.

Submitted 15 October, 2019; v1 submitted 13 October, 2019; originally announced October 2019.

arXiv:1909.06362 [pdf, other]

Crank up the volume: preference bias amplification in collaborative recommendation

Authors: Kun Lin, Nasim Sonboli, Bamshad Mobasher, Robin Burke

Abstract: Recommender systems are personalized: we expect the results given to a particular user to reflect that user's preferences. Some researchers have studied the notion of calibration, how well recommendations match users' stated preferences, and bias disparity, the extent to which mis-calibration affects different user groups. In this paper, we examine bias disparity over a range of different algorithms and for different item categories and demonstrate significant differences between model-based and memory-based algorithms.

Submitted 12 September, 2019; originally announced September 2019.

Comments: Presented at the RMSE workshop held in conjunction with the 13th ACM Conference on Recommender Systems (RecSys), 2019, in Copenhagen, Denmark

arXiv:1908.00831 [pdf, other]

Bias Disparity in Collaborative Recommendation: Algorithmic Evaluation and Comparison

Authors: Masoud Mansoury, Bamshad Mobasher, Robin Burke, Mykola Pechenizkiy

Abstract: Research on fairness in machine learning has been recently extended to recommender systems. One of the factors that may impact fairness is bias disparity, the degree to which a group's preferences on various item categories fail to be reflected in the recommendations they receive. In some cases biases in the original data may be amplified or reversed by the underlying recommendation algorithm. In this paper, we explore how different recommendation algorithms reflect the tradeoff between ranking quality and bias disparity. Our experiments include neighborhood-based, model-based, and trust-aware recommendation algorithms.

Submitted 2 August, 2019; originally announced August 2019.

Comments: Workshop on Recommendation in Multi-Stakeholder Environments (RMSE) at ACM RecSys 2019, Copenhagen, Denmark

arXiv:1907.13286 [pdf, other]

The Unfairness of Popularity Bias in Recommendation

Authors: Himan Abdollahpouri, Masoud Mansoury, Robin Burke, Bamshad Mobasher

Abstract: Recommender systems are known to suffer from the popularity bias problem: popular (i.e. frequently rated) items get a lot of exposure while less popular ones are under-represented in the recommendations. Research in this area has been mainly focusing on finding ways to tackle this issue by increasing the number of recommended long-tail items or otherwise the overall catalog coverage. In this paper, however, we look at this problem from the users' perspective: we want to see how popularity bias causes the recommendations to deviate from what the user expects to get from the recommender system. We define three different groups of users according to their interest in popular items (Niche, Diverse and Blockbuster-focused) and show the impact of popularity bias on the users in each group. Our experimental results on a movie dataset show that in many recommendation algorithms the recommendations the users get are extremely concentrated on popular items even if a user is interested in long-tail and non-popular items showing an extreme bias disparity.

Submitted 19 September, 2019; v1 submitted 30 July, 2019; originally announced July 2019.

arXiv:1907.07766 [pdf, other]

Flatter is better: Percentile Transformations for Recommender Systems

Authors: Masoud Mansoury, Robin Burke, Bamshad Mobasher

Abstract: It is well known that explicit user ratings in recommender systems are biased towards high ratings, and that users differ significantly in their usage of the rating scale. Implementers usually compensate for these issues through rating normalization or the inclusion of a user bias term in factorization models. However, these methods adjust only for the central tendency of users' distributions. In this work, we demonstrate that lack of flatness in rating distributions is negatively correlated with recommendation performance. We propose a rating transformation model that compensates for skew in the rating distribution as well as its central tendency by converting ratings into percentile values as a pre-processing step before recommendation generation. This transformation flattens the rating distribution, better compensates for differences in rating distributions, and improves recommendation performance. We also show a smoothed version of this transformation designed to yield more intuitive results for users with very narrow rating distributions. A comprehensive set of experiments shows improved ranking performance for these percentile transformations with state-of-the-art recommendation algorithms in four real-world data sets.

Submitted 10 July, 2019; originally announced July 2019.
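
The abstract above describes converting each user's ratings into percentile values before recommendation generation. A minimal sketch of one such per-user transformation follows. The function name and the tie-handling rule (percentile = share of the user's ratings less than or equal to the given rating) are assumptions; the paper's exact definition, including its smoothed variant, is not given in the abstract.

```python
def percentile_transform(user_ratings):
    """Map one user's raw ratings to percentile values in (0, 100].

    user_ratings: dict item id -> raw rating for a single user (non-empty).
    Tie handling here (count of ratings <= r) is a hypothetical choice.
    """
    values = list(user_ratings.values())
    n = len(values)
    return {
        item: 100.0 * sum(1 for v in values if v <= r) / n
        for item, r in user_ratings.items()
    }


# A user who rates almost everything highly: raw ratings span only 3 to 5.
ratings = {"a": 5, "b": 5, "c": 4, "d": 3}
print(percentile_transform(ratings))
# → {'a': 100.0, 'b': 100.0, 'c': 50.0, 'd': 25.0}
```

The transformed values depend only on the ordering within each user's own ratings, so users who use different parts of the rating scale become more directly comparable.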

arXiv:1905.08031 [pdf, other]

doi 10.1145/3320435.3320464

Power of the Few: Analyzing the Impact of Influential Users in Collaborative Recommender Systems

Authors: Farzad Eskandanian, Nasim Sonboli, Bamshad Mobasher

Abstract: Like other social systems, in collaborative filtering a small number of "influential" users may have a large impact on the recommendations of other users, thus affecting the overall behavior of the system. Identifying influential users and studying their impact on other users is an important problem because it provides insight into how small groups can inadvertently or intentionally affect the behavior of the system as a whole. Modeling these influences can also shed light on patterns and relationships that would otherwise be difficult to discern, hopefully leading to more transparency in how the system generates personalized content. In this work we first formalize the notion of "influence" in collaborative filtering using an Influence Discrimination Model. We then empirically identify and characterize influential users and analyze their impact on the system under different underlying recommendation algorithms and across three different recommendation domains: job, movie and book recommendations. Insights from these experiments can help in designing systems that are not only optimized for accuracy, but are also tuned to mitigate the impact of influential users when it might lead to potential imbalance or unfairness in the system's outcomes.

Submitted 14 May, 2019; originally announced May 2019.

arXiv:1905.06863 [pdf, other]

Modeling the Dynamics of User Preferences for Sequence-Aware Recommendation Using Hidden Markov Models

Authors: Farzad Eskandanian, Bamshad Mobasher

Abstract: In a variety of online settings involving interaction with end-users it is critical for the systems to adapt to changes in user preferences. User preferences on items tend to change over time due to a variety of factors such as change in context, the task being performed, or other short-term or long-term external factors. Recommender systems need to be able to capture these dynamics in user preferences in order to remain tuned to the most current interests of users. In this work we present a recommendation framework which takes into account the dynamics of user preferences. We propose an approach based on Hidden Markov Models (HMM) to identify change-points in the sequence of user interactions which reflect significant changes in preference according to the sequential behavior of all the users in the data. The proposed framework leverages the identified change points to generate recommendations using a sequence-aware non-negative matrix factorization model. We empirically demonstrate the effectiveness of the HMM-based change detection method as compared to standard baseline methods. Additionally, we evaluate the performance of the proposed recommendation method and show that it compares favorably to state-of-the-art sequence-aware recommendation models.

Submitted 14 May, 2019; originally announced May 2019.

Comments: arXiv admin note: substantial text overlap with arXiv:1810.00272

arXiv:1901.07555 [pdf, other]

Managing Popularity Bias in Recommender Systems with Personalized Re-ranking

Authors: Himan Abdollahpouri, Robin Burke, Bamshad Mobasher

Abstract: Many recommender systems suffer from popularity bias: popular items are recommended frequently while less popular, niche products, are recommended rarely or not at all. However, recommending the ignored products in the 'long tail' is critical for businesses as they are less likely to be discovered. In this paper, we introduce a personalized diversification re-ranking approach to increase the representation of less popular items in recommendations while maintaining acceptable recommendation accuracy. Our approach is a post-processing step that can be applied to the output of any recommender system. We show that our approach is capable of managing popularity bias more effectively, compared with an existing method based on regularization. We also examine both new and existing metrics to measure the coverage of long-tail items in the recommendation.

Submitted 12 August, 2019; v1 submitted 22 January, 2019; originally announced January 2019.

Comments: arXiv admin note: text overlap with arXiv:1802.05382

arXiv:1810.00272 [pdf, other]

Detecting Changes in User Preferences using Hidden Markov Models for Sequential Recommendation Tasks

Authors: Farzad Eskandanian, Bamshad Mobasher

Abstract: Recommender systems help users find relevant items of interest based on the past preferences of those users. In many domains, however, the tastes and preferences of users change over time due to a variety of factors and recommender systems should capture these dynamics in user preferences in order to remain tuned to the most current interests of users. In this work we present a recommendation framework based on Hidden Markov Models (HMM) which takes into account the dynamics of user preferences. We propose an HMM-based approach to change point detection in the sequence of user interactions which reflect significant changes in preference according to the sequential behavior of all the users in the data. The proposed framework leverages the identified change points to generate recommendations in two ways. In one approach change points are used to create a sequence-aware non-negative matrix factorization model to generate recommendations that are aligned with the current tastes of the user. In the second approach the HMM is used directly to generate recommendations taking into account the identified change points. These models are evaluated in terms of accuracy of change point detection and also the effectiveness of recommendations using a real music streaming dataset.

Submitted 29 September, 2018; originally announced October 2018.

Comments: 7 pages, 4 figures, RecSysKTL, Workshop on Intelligent Recommender Systems by Knowledge Transfer and Learning

arXiv:1802.05382 [pdf, other]

Popularity-Aware Item Weighting for Long-Tail Recommendation

Authors: Himan Abdollahpouri, Robin Burke, Bamshad Mobasher

Abstract: Many recommender systems suffer from the popularity bias problem: popular items are being recommended frequently while less popular, niche products, are recommended rarely if not at all. However, those ignored products are exactly the products that businesses need to find customers for and their recommendations would be more beneficial. In this paper, we examine an item weighting approach to improve long-tail recommendation. Our approach works as a simple yet powerful add-on to existing recommendation algorithms for making a tunable trade-off between accuracy and long-tail coverage.

Submitted 5 December, 2018; v1 submitted 14 February, 2018; originally announced February 2018.

arXiv:1703.00034 [pdf, other]

Weighted Random Walk Sampling for Multi-Relational Recommendation

Authors: Fatemeh Vahedian, Robin Burke, Bamshad Mobasher

Abstract: In the information overloaded web, personalized recommender systems are essential tools to help users find the most relevant information. The most heavily-used recommendation frameworks assume user interactions that are characterized by a single relation. However, for many tasks, such as recommendation in social networks, user-item interactions must be modeled as a complex network of multiple relations, not only a single relation. Recent research on multi-relational factorization and hybrid recommender models has shown that using extended meta-paths to capture additional information about both users and items in the network can enhance the accuracy of recommendations in such networks. Most of this work is focused on unweighted heterogeneous networks, and to apply these techniques, weighted relations must be simplified into binary ones. However, information associated with weighted edges, such as user ratings, which may be crucial for recommendation, is lost in such binarization. In this paper, we explore a random walk sampling method in which the frequency of edge sampling is a function of edge weight, and apply this to generate extended meta-paths in weighted heterogeneous networks. With this sampling technique, we demonstrate improved performance on multiple data sets both in terms of recommendation accuracy and model generation efficiency.

Submitted 2 March, 2017; v1 submitted 28 February, 2017; originally announced March 2017.
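
The abstract above describes a random walk in which the frequency of edge sampling is a function of edge weight. The sketch below illustrates the simplest such function, sampling each outgoing edge with probability proportional to its weight. The graph layout, the function name `weighted_walk`, and the proportional rule are assumptions for illustration; the paper's actual weighting function and meta-path constraints are not given in the abstract.

```python
import random


def weighted_walk(graph, start, length, rng):
    """Random walk that picks each outgoing edge in proportion to its weight.

    graph: dict node -> dict of neighbor -> non-negative edge weight
           (for example, a user-item rating). Hypothetical layout.
    rng:   a random.Random instance, passed in for reproducibility.
    """
    path = [start]
    node = start
    for _ in range(length):
        neighbors = graph.get(node)
        if not neighbors or sum(neighbors.values()) <= 0:
            break  # dead end: no edge with positive weight to follow
        nodes = list(neighbors)
        weights = [neighbors[n] for n in nodes]
        node = rng.choices(nodes, weights=weights, k=1)[0]
        path.append(node)
    return path


# Tiny user-item graph: user "u" rated item "i1" with 5 and "i2" with 1.
graph = {
    "u": {"i1": 5.0, "i2": 1.0},
    "i1": {"u": 5.0},
    "i2": {"u": 1.0},
}
print(weighted_walk(graph, "u", 4, random.Random(0)))
```

With these weights the walk leaves "u" via "i1" about five times as often as via "i2", so rating information that binarization would discard still shapes which paths are sampled.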

Showing 1–32 of 32 results for author: Mobasher, B