¹ State Key Laboratory of Cognitive Intelligence, University of Science and Technology of China, Hefei, China
{zhanglk5,mingjia-yin,harley,jiaqing.zhang}@mail.ustc.edu.cn, {wanghao3,liandefu,cheneh}@ustc.edu.cn
² Army Engineering University of PLA, Nanjing, China
[email protected]

A Unified Framework for Adaptive Representation Enhancement and Inversed Learning in Cross-Domain Recommendation

Luankang Zhang¹, Hao Wang¹ (✉), Suojuan Zhang², Mingjia Yin¹, Yongqiang Han¹, Jiaqing Zhang¹, Defu Lian¹, Enhong Chen¹

Abstract

Cross-domain recommendation (CDR), aiming to extract and transfer knowledge across domains, has attracted wide attention for its efficacy in addressing data sparsity and cold-start problems. Despite significant advances in representation disentanglement to capture diverse user preferences, existing methods usually neglect representation enhancement and lack rigorous decoupling constraints, thereby limiting the transfer of relevant information. To this end, we propose a Unified Framework for Adaptive Representation Enhancement and Inversed Learning in Cross-Domain Recommendation (AREIL). Specifically, we first divide user embeddings into domain-shared and domain-specific components to disentangle mixed user preferences. Then, we incorporate intra-domain and inter-domain information to adaptively enhance the ability of user representations. In particular, we propose a graph convolution module to capture high-order information, and a self-attention module to reveal inter-domain correlations and accomplish adaptive fusion. Next, we adopt domain classifiers and gradient reversal layers to achieve inversed representation learning in a unified framework. Finally, we employ a cross-entropy loss for measuring recommendation performance and jointly optimize the entire framework via multi-task learning. Extensive experiments on multiple datasets validate the substantial improvement in the recommendation performance of AREIL. Moreover, ablation studies and representation visualizations further illustrate the effectiveness of adaptive enhancement and inversed learning in CDR.

Keywords: Recommendation System · Cross-domain Recommendation · Disentanglement Learning

1 Introduction

The recommendation system (RS) has been developed to personalize recommendations by modeling users’ preferences based on historical interactions. While attracting wide attention in both academia and industry, RS usually faces data sparsity and cold start issues in real-world applications. To address these challenges, cross-domain recommendation (CDR) has emerged, which improves recommendation performance by transferring knowledge across domains [40].

CDR aims to model and transfer knowledge across domains, where it is necessary to capture domain-invariant user preferences. Early studies assume that users share consistent preferences across domains, thus directly adapting conventional recommendation methods (matrix factorization [21, 35], clustering [5, 13, 18, 44], and GNN [45]) to the CDR scenario. Considering diverse user preferences across domains, some studies adopt a two-step paradigm. They first model single-domain preferences using dedicated encoders and then enable cross-domain knowledge transfer by introducing transfer layers. For instance, CoNet [8] employs Multi-Layer Perceptron (MLP) to encode user preferences and facilitates domain-to-domain information transfer through a dual connection module. BiTGCF [16] adopts a graph-driven framework to model and transfer high-order collaborative information across domains. These approaches ignore the differences in user preferences across domains, potentially leading to the negative transfer problem [48]. Recognizing this, recent research focuses on explicitly separating user preferences into domain-shared and domain-specific components, with the former being transferred across domains. For example, DisenCDR [2] utilizes Variational Autoencoders (VAE) to create embeddings shared across domains and domain-specific embeddings. Additionally, it incorporates two regularizers based on mutual information to supervise the disentanglement process. Similarly, DR-MTCDR [3] employs a graph neural network to enhance information and disentangle representations with self-supervised learning.

Despite advanced performance achieved by disentanglement-based CDR methods, they exhibit limitations in two crucial aspects: (1) Adaptive Representation Enhancement. Existing works [3, 14, 27, 42] generally assume that domain-shared representations of the same user are completely consistent across domains, leading to an inability to capture diverse user interests. Consequently, they fail to explicitly distinguish the substantial variation in embedding quality caused by imbalanced data distribution and inconsistent domain-shared user preferences across domains. It’s a non-trivial challenge to explore the correlations of representations between domains and adaptively transfer important and generalizable knowledge. (2) Inversed Representation Learning. While several studies employ self-supervised methods like VAE [2] and contrastive learning [3, 47] to achieve mutual exclusion of disentangled representations, they cannot guarantee that domain-shared and domain-specific factors are assigned to the corresponding representations respectively. Ideally, these factors should exhibit an inverse relationship, encoding complementary information. Implementing such inversed constraints within a unified framework for learning disentangled user representations remains a critical challenge.

To tackle the aforementioned challenges, in this paper, we propose a Unified Framework for Adaptive Representation Enhancement and Inversed Learning in Cross-Domain Recommendation, denoted as AREIL. Specifically, we first initialize item and user representations and then divide user representations into domain-shared and domain-specific components for preference disentanglement. Second, we enhance user representations through an Adaptive Representation Enhancement Module (AREM). In particular, the intra-domain AREM constructs a user-item interaction graph to capture high-order collaborative information via iterative aggregation. Following this, the inter-domain AREM employs self-attention to evaluate cross-domain relevance and adaptively transfer processed domain-shared embeddings, focusing on important and general components. Third, we propose the Inversed Representation Learning Module (IRLM) for learning disentangled user preferences in a unified framework, employing domain classifiers and gradient reversal layers (GRL). The GRL reverses the gradient direction to achieve the inversed constraint objective. The domain classifier distinguishes the source of input and generates a comprehensive supervision signal for disentanglement constraints. Finally, we leverage a multi-task learning paradigm to optimize the entire framework in an end-to-end manner. In summary, the main contributions of this article can be listed as follows:

$\bullet$ We study the dual-target cross-domain recommendation problem from a novel perspective, which focuses on adaptive representation enhancement and inversed learning for user preference disentanglement.

$\bullet$ To enhance user representations, we propose an adaptive representation enhancement module. This module enables the exploration of inter-domain relevance, allowing for the adaptive transfer of important and general factors that contribute to improved recommendation performance.

$\bullet$ We leverage domain classifiers and gradient reversal layers within the inversed representation learning module to constrain the disentanglement of user preferences in a unified framework.

$\bullet$ Extensive experiments confirm that AREIL consistently outperforms state-of-the-art models. Supplementary ablation studies and representation visualizations further indicate that our modules can learn more informative and precise disentangled user representations in cross-domain recommendation.

2 Related Work

2.1 Cross-Domain Recommendation

Recommendation systems [4, 5, 25, 33, 37] typically face challenges associated with data sparsity and cold-start scenarios. Cross-domain recommendation (CDR) has emerged as a promising approach that leverages knowledge from a source domain to improve recommendation performance in a target domain [40].

Within the field of CDR, the central focus is on modeling domain-invariant user preferences and facilitating information transfer across diverse domains. Early approaches primarily extended single-domain recommendation methods, assuming that users share the same interests across different domains. For instance, techniques such as matrix factorization [20], graph-based methods [7, 26, 28], and contrastive learning [23, 34, 36, 38] are applied to individual domains, employing a basic transfer module to integrate information across domains. Although these approaches outperform single-domain recommendation methods, they disregard variations in user interests across domains and give rise to the negative transfer problem [48]. Some methods utilize two separate encoders to model user interests and introduce more complex transfer modules. For example, CoNet [8] introduces dual connections in a Multi-Layer Perceptron (MLP) network to achieve deep bidirectional knowledge transfer. DARec [39] utilizes autoencoders and adversarial learning for knowledge transfer among shared users. DDTCDR [15] introduces a deep dual transfer network with a cross-domain latent orthogonal mapping to preserve the similarity of user preferences across domains. Furthermore, subsequent research introduced Graph Neural Network (GNN)-based transfer modules, incorporating higher-order information into the transfer process. For example, PPGN [45] captures the multi-hop propagation of user preferences, explicitly models cross-domain interactions, and preserves the structural information. BiTGCF [16] leverages higher-order connectivity in single-domain user-item graphs through a novel feature propagation layer, facilitating bidirectional knowledge exchange between two domains by utilizing overlapped users as intermediaries. Despite advancements in encoder architectures, existing methods fail to adequately capture the diversity of user preferences across domains, leading to suboptimal recommendation performance.

2.2 Disentanglement Learning in CDR

To further alleviate the negative transfer problem, disentanglement-based CDR methods [19, 24, 31, 32, 43] have gained prominence. These methods disentangle features into domain-specific and domain-shared components and transfer only the domain-shared information, which mitigates potential interference caused by transferring domain-specific information. The Variational AutoEncoder (VAE) [1] is known to be effective for disentanglement, motivating works like DisenCDR [2] and MTNet [9] to disentangle features with different disentanglement objectives. Adversarial learning has also been introduced to CDR. For example, DA-CDR [41] controlled the decoupling process with orthogonality constraints, while DIDA-CDR [49] introduced the concept of “domain-independent information”, questioning the accuracy of existing domain-shared encodings and achieving more precise decoupling. Besides, some works [3, 12, 42] proposed to disentangle representations in a self-supervised learning fashion, which provides valuable guidance for disentanglement under data sparsity. However, incomplete disentanglement in these methods, due to either limited decoupling control or insufficient mutual exclusion, results in inadvertent feature transfer across domains and reduced recommendation performance.

3 Problem Definition

In this section, we will provide formal definitions to facilitate a more precise explanation of the problem. Given two domains, denoted as $X$ and $Y$ , CDR leverages information from a relatively richer domain to improve performance in a sparser domain. This is achieved by capturing domain-invariant user preferences and transferring information across domains. In the following, we mainly focus on the dual-target CDR. The formalization of the problem is as follows:

Definition (Dual-target Cross-Domain Recommendation). We consider a general scenario wherein users are fully overlapped while items remain non-overlapped. Let $\mathcal{D}^{X}=(\mathcal{U},\mathcal{V}^{X},\mathcal{E}^{X})$ and $\mathcal{D}^{Y}=(\mathcal{U},\mathcal{V}^{Y},\mathcal{E}^{Y})$ denote the interaction data of domains $X$ and $Y$, respectively. Specifically, $\mathcal{U}$ represents the shared user set between the two domains, with size ${|\mathcal{U}|}$. Furthermore, $\mathcal{V}^{X}$ and $\mathcal{V}^{Y}$ indicate the non-overlapping item sets specific to domain $X$ and domain $Y$, respectively. The interactions between users and items are captured using the edge sets $\mathcal{E}^{X}$ and $\mathcal{E}^{Y}$. Moreover, binary interaction matrices $A^{X}\in\{0,1\}^{|\mathcal{U}|\times|\mathcal{V}^{X}|}$ and $A^{Y}\in\{0,1\}^{|\mathcal{U}|\times|\mathcal{V}^{Y}|}$ are employed to encode the user-item interactions in domains $X$ and $Y$, where $A_{ij}=1$ indicates an interaction between user $u_{i}\in\mathcal{U}$ and item $v_{j}\in\mathcal{V}$, while $A_{ij}=0$ indicates the absence of such an interaction. Then, given $(\mathcal{D}^{X},\mathcal{D}^{Y},A^{X},A^{Y})$, dual-target CDR aims to improve the performance of both domains $X$ and $Y$ simultaneously.
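As an illustration of the definition above, the following minimal sketch builds the binary interaction matrices $A^{X}$ and $A^{Y}$ from edge sets over a shared user set. The toy edge lists and the helper names `build_interaction_matrix` and `density` are hypothetical and are not taken from the paper; `density` mirrors the "Density" statistic commonly reported for such datasets (observed interactions divided by $|\mathcal{U}|\cdot|\mathcal{V}|$).

```python
def build_interaction_matrix(num_users, num_items, edges):
    """Return a binary |U| x |V| matrix A with A[i][j] = 1 iff (i, j) is an observed edge."""
    A = [[0] * num_items for _ in range(num_users)]
    for user, item in edges:
        A[user][item] = 1
    return A


def density(A):
    """Fraction of user-item pairs that have an observed interaction."""
    total_cells = len(A) * len(A[0])
    return sum(sum(row) for row in A) / total_cells


# Hypothetical toy data: 3 shared users, 2 items in domain X, 3 items in domain Y.
edges_x = [(0, 0), (1, 0), (1, 1), (2, 1)]
edges_y = [(0, 2), (1, 0), (2, 1)]
A_x = build_interaction_matrix(3, 2, edges_x)
A_y = build_interaction_matrix(3, 3, edges_y)
```

The two matrices share their row index (the user set $\mathcal{U}$) but have unrelated column indices, which reflects the fully-overlapped-users, non-overlapped-items setting.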

Traditional dual-target CDR methods neglect the variation in user preferences across domains, thereby limiting the efficacy of knowledge transfer. To capture diverse user preferences, there arises a necessity to disentangle user interests, a process that can be precisely formalized as follows:

Definition (User Preferences Disentanglement). Given the user representation $\mathbf{Z}_{u}^{X}$ in domain $X$ , we disentangle it into domain-shared $\mathbf{Z}_{\text{u,sha}}^{X}$ and domain-specific components $\mathbf{Z}_{\text{{u,spe}}}^{X}$ , respectively. Similarly, we disentangle the user representation $\mathbf{Z}_{u}^{Y}$ in domain $Y$ into $\mathbf{Z}_{\text{{u,sha}}}^{Y}$ and $\mathbf{Z}_{\text{{u,spe}}}^{Y}$ . The disentangled representation is expected to encode corresponding information, wherein domain-shared encoding denotes domain-invariant information, and domain-specific encoding conveys domain-related information.

Current disentanglement methods overlook representation enhancement and lack sufficient decoupling constraints. We present an end-to-end framework for addressing these issues. The methodological details are fully described in Sec. 4.

4 Methodology

Figure 1: The AREIL framework for adaptive representation enhancement and inversed representation learning.

In this section, we introduce a unified framework for Adaptive Representation Enhancement and Inversed Learning in Cross-Domain Recommendation (AREIL) to achieve adaptive representation enhancement and inversed representation learning. The framework is illustrated in Fig. 1. In Sec. 4.1, AREIL initially disentangles mixed user preferences into domain-shared and domain-specific components by splitting user representations equally. Moving forward to Sec. 4.2, the Adaptive Representation Enhancement Module (AREM) is designed to enhance the modeling of user preferences. In particular, intra-domain high-order information is incorporated by utilizing LightGCN to aggregate neighbor node representations, and the inter-domain AREM adaptively facilitates the transfer of important and general factors across domains by self-attention. Additionally, Sec. 4.3 introduces domain classifiers and gradient reversal layers to learn disentangled user representations in a unified framework by generating effective self-supervised signals. Finally, we make predictions and optimize the entire framework using a multi-task learning paradigm.

4.1 Disentanglement-based Embedding Layer

In traditional cross-domain recommendation methods [8, 39], it is assumed that users maintain consistent interests across different domains, which often results in negative transfer because it fails to effectively capture diverse user preferences. To overcome this limitation, we introduce a Disentanglement-based Embedding Layer that segregates user preferences into two components: domain-shared and domain-specific. Specifically, considering domain $X$, we evenly partition $\mathbf{Z}_{u}^{X}\in\mathbb{R}^{|\mathcal{U}|\times d}$ into two components: domain-shared user embeddings $\mathbf{Z}_{\text{u,sha}}^{X}\in\mathbb{R}^{|\mathcal{U}|\times d/2}$ and domain-specific user embeddings $\mathbf{Z}_{\text{u,spe}}^{X}\in\mathbb{R}^{|\mathcal{U}|\times d/2}$, i.e. $\mathbf{Z}_{u}^{X}=\mathbf{Z}_{\text{u,sha}}^{X}\|\mathbf{Z}_{\text{u,spe}}^{X}$. The even number $d$ denotes the feature dimension. The Disentanglement-based Embedding Layer serves as the foundational basis for our model, defining domain-shared preferences separately for each domain. This design enables the embeddings to express the varying importance and generality of domain-independent features within each domain. The same process is applied to domain $Y$. Moreover, we utilize $\mathbf{Z}_{v}^{X}\in\mathbb{R}^{|\mathcal{V}^{X}|\times d}$ and $\mathbf{Z}_{v}^{Y}\in\mathbb{R}^{|\mathcal{V}^{Y}|\times d}$ to represent item sets.
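The even split described above can be sketched as follows. This is only an illustrative sketch on plain Python lists; the function name `split_user_embeddings` and the toy values are hypothetical, and the first half is taken as the domain-shared part to match the concatenation order $\mathbf{Z}_{u}^{X}=\mathbf{Z}_{\text{u,sha}}^{X}\|\mathbf{Z}_{\text{u,spe}}^{X}$.

```python
def split_user_embeddings(Z_u):
    """Split a |U| x d embedding table into domain-shared and domain-specific halves."""
    d = len(Z_u[0])
    if d % 2 != 0:
        raise ValueError("feature dimension d must be even")
    half = d // 2
    Z_sha = [row[:half] for row in Z_u]  # first d/2 features: domain-shared
    Z_spe = [row[half:] for row in Z_u]  # last d/2 features: domain-specific
    return Z_sha, Z_spe


# Hypothetical toy table: 2 users, d = 4.
Z_u_x = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8]]
Z_sha_x, Z_spe_x = split_user_embeddings(Z_u_x)
```

Concatenating the two halves row by row recovers the original table, so the split itself loses no information; the disentanglement comes from the constraints applied to each half later.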

4.2 Adaptive Representation Enhancement Module

Given the disentangled embeddings, we employ the Adaptive Representation Enhancement Module (AREM) to enhance their ability by incorporating both intra-domain and inter-domain information.

4.2.1 Intra-domain Enhancement.

To capture high-order collaborative information, we construct a heterogeneous bipartite graph by leveraging user-item interactions. Subsequently, we apply LightGCN [7] to enhance node representations through the aggregation of embeddings from adjacent nodes. In the context of domain $X$ , the linear propagation procedure can be summarized as follows:

$$\mathbf{Z}_{u}^{X,k+1}=\sum_{i\in\mathcal{N}_{u}^{X}}\frac{1}{\sqrt{\left|\mathcal{N}_{u}^{X}\right|\left|\mathcal{N}_{i}^{X}\right|}}\mathbf{Z}_{i}^{X,k}, \tag{1}$$

where $k$ indexes the graph convolutional layer, $\mathcal{N}_{u}^{X}$ denotes the set of neighboring nodes of the target node $u$ in domain $X$, and $\mathbf{Z}_{i}^{X,k}$ denotes the embedding of node $i$ at the $k$-th layer.

To fuse distinct semantic information, we concatenate the representations obtained from layers $0$ through $K$, resulting in the enhanced node representation $\mathbf{Z}_{u}^{X}$:

$$\mathbf{Z}_{u}^{X}=\mathbf{Z}_{u}^{X,0}\|\ldots\|\mathbf{Z}_{u}^{X,K}. \tag{2}$$
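The propagation in Eq. (1) and the layer concatenation in Eq. (2) can be sketched on a dense toy graph as below. This is a minimal sketch rather than the authors' implementation: the function name `lightgcn_propagate` and the toy inputs are hypothetical, and the item-side update is assumed to be the symmetric counterpart of Eq. (1), as in standard LightGCN.

```python
import math


def lightgcn_propagate(A, Z_user, Z_item, num_layers):
    """Linear propagation on a bipartite graph, followed by layer-wise concatenation."""
    n_u, n_v = len(A), len(A[0])
    d = len(Z_user[0])
    deg_u = [sum(row) for row in A]          # |N_u| for every user
    deg_v = [sum(col) for col in zip(*A)]    # |N_i| for every item
    user_layers, item_layers = [Z_user], [Z_item]
    for _ in range(num_layers):
        prev_u, prev_v = user_layers[-1], item_layers[-1]
        next_u = [[0.0] * d for _ in range(n_u)]
        next_v = [[0.0] * d for _ in range(n_v)]
        for u in range(n_u):
            for v in range(n_v):
                if A[u][v]:
                    # Symmetric normalisation 1 / sqrt(|N_u| |N_i|) from Eq. (1).
                    w = 1.0 / math.sqrt(deg_u[u] * deg_v[v])
                    for f in range(d):
                        next_u[u][f] += w * prev_v[v][f]
                        next_v[v][f] += w * prev_u[u][f]  # assumed symmetric item update
        user_layers.append(next_u)
        item_layers.append(next_v)
    # Eq. (2): concatenate layers 0..K for every node.
    Z_user_out = [[x for layer in user_layers for x in layer[u]] for u in range(n_u)]
    Z_item_out = [[x for layer in item_layers for x in layer[v]] for v in range(n_v)]
    return Z_user_out, Z_item_out


# Hypothetical toy graph: user 0 interacted with item 0; user 1 with items 0 and 1.
A = [[1, 0], [1, 1]]
Z_user_out, Z_item_out = lightgcn_propagate(A, [[1.0], [2.0]], [[4.0], [8.0]], num_layers=1)
```

With $K$ propagation layers and input dimension $d$, each output row has $d(K+1)$ entries. A practical implementation would use sparse matrix products instead of the nested loops shown here.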

4.2.2 Inter-domain Enhancement.

To address significant data sparsity issues in CDR, we further enhance representations by transferring information across domains. Unfortunately, CDR frequently encounters significant variations in representation quality due to data skewness, where a few users contribute to most interaction records. Additionally, preferences existing simultaneously in both domains may exhibit distinct behaviors. These characteristics make it impractical to simply transfer domain-shared representations. To address these issues, we introduce the Inter-domain Adaptive Representation Enhancement Module (Inter-domain AREM). This module employs a self-attention mechanism [22] to explore correlations between domains and adaptively transfer important and general knowledge at the node level. For simplicity, we will illustrate the procedure using domain $X$ , and a similar process can be extended to domain $Y$ .

To illustrate the inherent relationship between domain $X$ and domain $Y$, we introduce the attention matrix $ATT^{X}$. Specifically, given the domain-shared user representations $\mathbf{Z}_{\text{u,sha}}^{X}\in\mathbb{R}^{|\mathcal{U}|\times d/2}$ from domain $X$ and $\mathbf{Z}_{\text{u,sha}}^{Y}\in\mathbb{R}^{|\mathcal{U}|\times d/2}$ from domain $Y$, we consider them as sequences of length 2. These sequences are combined to form $\mathbf{Z}_{\text{u}}\in\mathbb{R}^{|\mathcal{U}|\times d}$, i.e. $\mathbf{Z}_{\text{u}}=\mathbf{Z}_{\text{u,sha}}^{X}\|\mathbf{Z}_{\text{u,sha}}^{Y}$. Next, we introduce the learnable query weight matrix $W_{Q}^{X}\in\mathbb{R}^{d\times d}$ and key weight matrix $W_{K}^{X}\in\mathbb{R}^{d\times d}$. We then obtain the query vector $\mathbf{Q}^{X}$ and key vector $\mathbf{K}^{X}$:

$$\mathbf{Q}^{X}=\mathbf{Z}_{\text{u}}W_{Q}^{X},\quad\mathbf{K}^{X}=\mathbf{Z}_{\text{u}}W_{K}^{X}. \tag{3}$$

Therefore, we get the attention matrix $ATT^{X}$ by calculating the dot product between the query vector $\mathbf{Q}^{X}$ and the key vector $\mathbf{K}^{X}$ , i.e. $ATT^{X}=(\mathbf{Q}^{X})^{T}\mathbf{K}^{X}$ .

Based on the attention matrix $ATT^{X}$ , we introduce a gating mechanism to emphasize the important and general components in $\mathbf{Z}_{\text{{u,sha}}}^{Y}$ . Specifically, we calculate the feature-related distribution $p$ by summing the attention matrix $ATT^{X}$ along the query dimensions. A higher value of $p$ indicates a stronger correlation between the corresponding elements in $\mathbf{Z}_{\text{{u,sha}}}^{X}$ and $\mathbf{Z}_{\text{{u,sha}}}^{Y}$ , suggesting greater importance for injection into domain $X$ . Next, utilizing $p$ , we emphasize important and general components in $\mathbf{Z}_{\text{{u,sha}}}^{Y}$ and derive the commonality $c^{Y}$ :

$$c^{Y}=\mathbf{Z}_{\text{u,sha}}^{Y}\odot\operatorname{Norm}(p), \tag{4}$$

where $\odot$ signifies element-wise product, and $\operatorname{Norm}(\cdot)$ denotes normalization.

Finally, we derive the enhanced domain-shared representation $\mathbf{\hat{Z}}_{\text{{u,sha}}}^{X}$ of domain $X$ by integrating commonality in domain $Y$ :

$$\mathbf{\hat{Z}}_{\text{u,sha}}^{X}=\gamma_{s}\mathbf{Z}_{\text{u,sha}}^{X}+(1-\gamma_{s})c^{Y}, \tag{5}$$

where $\gamma_{s}$ is the weight score that balances the domain-shared user representation $\mathbf{Z}_{\text{{u,sha}}}^{X}$ in domain $X$ and commonality $c^{Y}$ in domain $Y$ .
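The steps in Eqs. (3) to (5) can be sketched as below, under assumptions that the text leaves open. In particular, summing $ATT^{X}\in\mathbb{R}^{d\times d}$ over the query axis gives $p\in\mathbb{R}^{d}$, whereas $\mathbf{Z}_{\text{u,sha}}^{Y}$ has only $d/2$ features; this sketch assumes that the last $d/2$ entries of $p$ (those aligned with the domain-$Y$ features) are used, and that $\operatorname{Norm}(\cdot)$ is a softmax. Both choices, together with the helper names, are hypothetical and not confirmed by the paper.

```python
import math


def matmul(A, B):
    """Dense matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def softmax(xs):
    m = max(xs)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]


def inter_domain_enhance(Z_sha_x, Z_sha_y, W_q, W_k, gamma_s):
    """Sketch of Eqs. (3)-(5): gate domain-Y shared features and fuse them into domain X."""
    half = len(Z_sha_x[0])
    Z_u = [rx + ry for rx, ry in zip(Z_sha_x, Z_sha_y)]  # |U| x d concatenation
    Q = matmul(Z_u, W_q)                                  # Eq. (3)
    K = matmul(Z_u, W_k)
    att = matmul(transpose(Q), K)                         # ATT = Q^T K, shape d x d
    p = [sum(col) for col in zip(*att)]                   # sum over the query axis
    gate = softmax(p[half:])                              # ASSUMPTION: Y-aligned slice + softmax as Norm
    c_y = [[z * g for z, g in zip(row, gate)] for row in Z_sha_y]  # Eq. (4)
    return [[gamma_s * x + (1.0 - gamma_s) * c for x, c in zip(rx, rc)]
            for rx, rc in zip(Z_sha_x, c_y)]              # Eq. (5)


# Hypothetical toy input: one user, d/2 = 2, identity projection matrices.
W = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
Z_hat_x = inter_domain_enhance([[1.0, 3.0]], [[2.0, 4.0]], W, W, gamma_s=0.9)
```

Setting `gamma_s` to 1 returns the domain-$X$ shared representation unchanged, which corresponds to disabling the inter-domain transfer.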

4.3 Inversed Representation Learning Module

Building upon adaptively enhanced representations, we establish a unified framework for learning disentangled user preferences. Existing studies [2, 30] suffer from limited self-supervised signal quality and fail to enforce that domain-specific and domain-shared representations encode domain-dependent and domain-invariant information, respectively. To this end, we introduce the Inversed Representation Learning Module (IRLM). This module integrates a domain classifier and a gradient reversal layer to supervise the disentanglement process within a unified framework.

The domain classifier acts as a crucial tool that takes in user representations and predicts the domains from which they originate. For domain-specific user representations, we utilize them to encode domain-dependent information, allowing the domain classifier to readily discern the origin of inputs. The constraint is accomplished through the minimization of the loss function:

$$\mathcal{L}_{cls_{spe}}=\ell(DC(\mathbf{Z}_{\text{u,spe}}^{X}),O)+\ell(DC(\mathbf{Z}_{\text{u,spe}}^{Y}),O), \tag{6}$$

where $\ell$ represents the cross-entropy loss, $O$ denotes the domain label ($O=0$ for domain $X$ and $O=1$ otherwise), and $DC(\cdot)$ denotes the output of the domain classifier, which is implemented as a multi-layer perceptron (MLP).

Concerning domain-shared representations, we intend for them to encode domain-independent information and confuse the domain classifier. In other words, the objective of domain-shared user representations is to maximize the domain classification error, directly contrasting the goal of the domain classifier, which is to precisely categorize input representations. To learn these two inversed optimization objectives in a unified framework, we design a gradient reversal layer (GRL). The GRL automatically reverses the gradient direction during backpropagation and remains constant during forward propagation:

$$GRL(x)=x,\quad\frac{dGRL}{dx}=-\lambda I, \tag{7}$$

where $I$ denotes the identity matrix, and $\lambda$ is a dynamically adjusted parameter that balances the relationship between domain adaptation and classification accuracy.

Through the incorporation of GRL, we compel the domain-shared representations to encode invariant user preferences by minimizing the loss:

$$\mathcal{L}_{cls_{sha}}=\ell(DC(GRL(\mathbf{\hat{Z}}_{\text{u,sha}}^{X})),O)+\ell(DC(GRL(\mathbf{\hat{Z}}_{\text{u,sha}}^{Y})),O). \tag{8}$$

The disentanglement of user representations is rigorously constrained by final classification loss $\mathcal{L}_{cls}$ , which arises from the fusion of domain-shared and domain-specific components:

$$\mathcal{L}_{cls}=\mathcal{L}_{cls_{sha}}+\mathcal{L}_{cls_{spe}}. \tag{9}$$
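To make the inversed objectives concrete, the sketch below computes the domain-classification gradient by hand and shows how a gradient reversal layer changes what reaches the representation. It is a simplified illustration: the domain classifier is assumed to be a single linear layer with a sigmoid output instead of the MLP described above, and all names and toy values are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def domain_loss_and_grads(z, w, b, label):
    """Binary cross-entropy of a linear domain classifier DC(z) = sigmoid(w . z + b)."""
    logit = sum(wi * zi for wi, zi in zip(w, z)) + b
    prob = sigmoid(logit)
    loss = -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))
    dlogit = prob - label                  # d loss / d logit
    grad_w = [dlogit * zi for zi in z]     # gradient used to train the classifier
    grad_z = [dlogit * wi for wi in w]     # gradient flowing back to the representation
    return loss, grad_w, grad_z


def grl_backward(grad_z, lam):
    """Backward pass of the GRL in Eq. (7): identity forward, gradient scaled by -lambda."""
    return [-lam * g for g in grad_z]


z = [1.0, -2.0]      # hypothetical user representation
w = [0.5, 0.25]      # hypothetical classifier weights
loss, grad_w, grad_z = domain_loss_and_grads(z, w, 0.0, label=1)

grad_z_spe = grad_z                          # domain-specific part: plain gradient (Eq. 6)
grad_z_sha = grl_backward(grad_z, lam=2.0)   # domain-shared part: reversed gradient (Eq. 8)
```

The classifier parameters receive the same gradient in both cases, so the classifier always learns to separate domains. Only the representation update differs: the domain-specific part moves to reduce the classification loss, while the domain-shared part moves to increase it.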

4.4 Prediction Layer and Multi-task Learning

Following the adaptive enhancement of representations and the inversed learning of disentangled user preferences, we can acquire the prediction scores $\hat{r}_{ui}^{\mathrm{X}}$ and $\hat{r}_{ui}^{\mathrm{Y}}$ :

$$\hat{r}_{ui}^{\mathrm{X}}={\mathbf{Z}_{u}^{X}}^{T}\mathbf{Z}_{i}^{X},\quad\hat{r}_{ui}^{\mathrm{Y}}={\mathbf{Z}_{u}^{Y}}^{T}\mathbf{Z}_{i}^{Y}. \tag{10}$$

We then adopt a cross-entropy function to measure the model’s performance:

$$\mathcal{L}_{rec}=-\sum\left[r_{ui}\log\hat{r}_{ui}+(1-r_{ui})\log(1-\hat{r}_{ui})\right]. \tag{11}$$

To prevent overfitting, the regularization loss $\mathcal{L}_{reg}=\|\Theta\|_{2}^{2}$ is applied to the parameter set $\Theta$ . Consequently, the joint learning objective function of AREIL could be defined as the following multi-task learning framework:

$$\mathcal{L}=\mathcal{L}_{rec}+\lambda_{1}\mathcal{L}_{cls}+\lambda_{2}\mathcal{L}_{reg}, \tag{12}$$

where $\lambda_{1}$ and $\lambda_{2}$ are hyper-parameters balancing $\mathcal{L}_{cls}$ and $\mathcal{L}_{reg}$ .
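The prediction score, the recommendation loss, and the joint objective in Eqs. (10) to (12) can be sketched as follows. One point is an assumption rather than something stated in the text: the raw inner product is passed through a sigmoid so that it lies in $(0,1)$ before the logarithms are taken. The function names and toy values are hypothetical.

```python
import math


def predict(z_u, z_i):
    """Eq. (10): inner product of user and item representations."""
    return sum(a * b for a, b in zip(z_u, z_i))


def recommendation_loss(labels, scores, eps=1e-12):
    """Eq. (11): summed binary cross-entropy over user-item pairs."""
    total = 0.0
    for r, s in zip(labels, scores):
        p = 1.0 / (1.0 + math.exp(-s))   # ASSUMPTION: sigmoid maps the score into (0, 1)
        p = min(max(p, eps), 1.0 - eps)  # clamp to keep the logarithms finite
        total -= r * math.log(p) + (1 - r) * math.log(1.0 - p)
    return total


def l2_regularization(params):
    """Squared L2 norm of a flat parameter list."""
    return sum(x * x for x in params)


def joint_loss(l_rec, l_cls, l_reg, lambda_1, lambda_2):
    """Eq. (12): multi-task objective."""
    return l_rec + lambda_1 * l_cls + lambda_2 * l_reg


score = predict([1.0, 2.0], [3.0, -1.0])
l_rec = recommendation_loss([1, 0], [score, -score])
total = joint_loss(l_rec, l_cls=2.0, l_reg=l2_regularization([1.0, 2.0]), lambda_1=0.1, lambda_2=0.01)
```

Setting `lambda_1` to 0 removes the classification term and leaves only the recommendation loss and the regularizer.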

5 Experiments

5.1 Experiments Settings

5.1.1 Datasets.

Table 1: Statistical information of experimental datasets.

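\begin{tabular}{llrrrr}
\hline
Dataset & Domain & \#Users & \#Items & \#Interactions & Density \\
\hline
Elec\&Phone & Elec & 3,325 & 17,709 & 52,966 & 0.089\% \\
Elec\&Phone & Phone & 3,325 & 38,706 & 118,114 & 0.091\% \\
Sport\&Phone & Sport & 4,998 & 20,845 & 54,256 & 0.052\% \\
Sport\&Phone & Phone & 4,998 & 13,655 & 46,445 & 0.068\% \\
Elec\&Cloth & Elec & 15,761 & 51,447 & 224,689 & 0.027\% \\
Elec\&Cloth & Cloth & 15,761 & 48,781 & 133,609 & 0.017\% \\
\hline
\end{tabular}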

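Table 2: The experimental results (%) for all models, including the Recall@20 and NDCG@20 metrics. The best results are highlighted in bold, while sub-optimal results are underlined. The last row indicates the percentage improvement in performance of our method compared to the best baseline (p-value $<$ 0.05).

\begin{tabular}{lcccccccccccc}
\hline
Datasets & \multicolumn{4}{c}{Elec\&Phone} & \multicolumn{4}{c}{Sport\&Phone} & \multicolumn{4}{c}{Elec\&Cloth} \\
Domains & \multicolumn{2}{c}{Elec} & \multicolumn{2}{c}{Phone} & \multicolumn{2}{c}{Sport} & \multicolumn{2}{c}{Phone} & \multicolumn{2}{c}{Elec} & \multicolumn{2}{c}{Cloth} \\
Metrics@20 & Recall & NDCG & Recall & NDCG & Recall & NDCG & Recall & NDCG & Recall & NDCG & Recall & NDCG \\
\hline
BPR & 5.75 & 2.86 & 3.44 & 1.83 & 4.05 & 1.87 & 5.52 & 2.72 & 3.48 & 1.58 & 0.94 & 0.39 \\
NGCF & 7.27 & 3.40 & 3.89 & 2.26 & 4.54 & 2.28 & 7.04 & 3.29 & 3.80 & 1.77 & 1.52 & 0.66 \\
LightGCN & \underline{7.98} & \underline{3.69} & 4.09 & 2.34 & 5.15 & \underline{2.54} & 7.45 & 3.48 & 3.84 & \underline{1.82} & \underline{1.85} & \underline{0.84} \\
DGCF & 7.03 & 3.48 & 3.84 & 2.22 & 4.84 & 2.34 & 6.75 & 3.28 & 3.74 & 1.69 & 1.61 & 0.72 \\
MultVAE & 6.69 & 3.24 & 3.86 & 2.28 & 4.30 & 2.02 & 6.80 & 3.10 & \underline{3.97} & 1.79 & 1.48 & 0.61 \\
CoNet & 3.76 & 1.45 & 2.96 & 1.62 & 2.37 & 1.11 & 5.66 & 2.27 & 3.26 & 1.38 & 1.34 & 0.54 \\
BiTGCF & 7.32 & 3.46 & \underline{4.86} & \underline{2.79} & \underline{5.38} & 2.39 & \underline{7.83} & \underline{3.55} & 3.82 & 1.71 & 1.69 & 0.76 \\
DRMTCDR & 5.90 & 2.51 & 4.08 & 2.38 & 3.67 & 1.71 & 5.37 & 2.39 & 3.73 & 1.60 & 1.27 & 0.53 \\
DisenCDR & 5.27 & 2.12 & 3.85 & 2.18 & 3.60 & 1.59 & 6.41 & 2.67 & 3.35 & 1.42 & 1.39 & 0.55 \\
\hline
AREIL & \textbf{8.29} & \textbf{3.92} & \textbf{5.12} & \textbf{3.06} & \textbf{5.72} & \textbf{2.73} & \textbf{8.07} & \textbf{3.69} & \textbf{4.30} & \textbf{1.97} & \textbf{1.94} & \textbf{0.88} \\
Improv. (\%) & 3.88 & 6.23 & 5.35 & 9.68 & 6.32 & 7.48 & 3.07 & 3.94 & 8.31 & 8.24 & 4.86 & 4.76 \\
\hline
\end{tabular}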

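Table 3: Ablation study (%) with key modules in AREIL, reporting Recall@20 and NDCG@20.

\begin{tabular}{lcccccccc}
\hline
Datasets & \multicolumn{4}{c}{Sport\&Phone} & \multicolumn{4}{c}{Elec\&Cloth} \\
Domain & \multicolumn{2}{c}{Sport} & \multicolumn{2}{c}{Phone} & \multicolumn{2}{c}{Elec} & \multicolumn{2}{c}{Cloth} \\
Metric & Recall & NDCG & Recall & NDCG & Recall & NDCG & Recall & NDCG \\
\hline
w/o graph & 3.91 & 1.87 & 5.42 & 2.70 & 3.52 & 1.51 & 1.34 & 0.60 \\
w/o AREM & 5.17 & 2.27 & 7.95 & 3.63 & 4.11 & 1.92 & 1.91 & 0.86 \\
w/o IRLM & 5.27 & 2.31 & 7.87 & 3.61 & 4.01 & 1.88 & 1.76 & 0.80 \\
AREIL & 5.72 & 2.73 & 8.07 & 3.69 & 4.30 & 1.97 & 1.94 & 0.88 \\
\hline
\end{tabular}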

To evaluate the performance of AREIL, we conduct experiments on three real-world recommendation datasets from Amazon Datasets [6], which are widely used in cross-domain recommendation research [2, 16, 49] and are considered standard benchmarks. To ensure consistency in line with previous research studies, we apply a filtering process to retain only those users who exist in both domains simultaneously. This forms three distinct cross-domain scenarios: Elec&Phone, Sport&Phone, and Elec&Cloth. To convert user-item interactions into implicit data, we binarize the ratings as 0 or 1 to indicate the absence or presence of interactions. Detailed descriptions of these three cross-domain recommendation scenarios are provided in Table 1.
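The preprocessing described above (keeping only users present in both domains and binarizing ratings into implicit interactions) can be sketched as follows. The record layout `(user, item, rating)`, the function name, and the toy records are hypothetical; the actual Amazon data files use a different schema.

```python
def keep_overlapping_users(ratings_x, ratings_y):
    """Keep users that appear in both domains and drop rating values (implicit feedback)."""
    users_x = {user for user, _, _ in ratings_x}
    users_y = {user for user, _, _ in ratings_y}
    shared = users_x & users_y
    # Any observed rating becomes an interaction (1); unobserved pairs are implicitly 0.
    edges_x = sorted({(user, item) for user, item, _ in ratings_x if user in shared})
    edges_y = sorted({(user, item) for user, item, _ in ratings_y if user in shared})
    return shared, edges_x, edges_y


# Hypothetical toy records: (user, item, rating).
ratings_x = [("u1", "e1", 5.0), ("u2", "e2", 1.0), ("u3", "e1", 3.0)]
ratings_y = [("u1", "p1", 4.0), ("u3", "p2", 2.0), ("u4", "p1", 5.0)]
shared, edges_x, edges_y = keep_overlapping_users(ratings_x, ratings_y)
```

After this step both domains have the same user set, which is why the #Users column is identical within each scenario in Table 1.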

5.1.2 Compared Methods.

We compare AREIL with several classical and state-of-the-art single-domain and cross-domain recommendation methods to demonstrate its effectiveness. The evaluated methods include:

$\bullet$ Single-domain approaches. (1) BPR [20] utilizes collaborative filtering techniques based on matrix factorization for personalized recommendation. (2) NGCF [28] utilizes graph convolutional networks to capture high-order collaborative signals, considering complex user-item interactions. (3) LightGCN [7] improves recommendation performance by simplifying the hierarchical structure of graph convolutional networks. (4) DGCF [29] employs dynamic graph convolutional networks and incorporates disentanglement via covariance regularization. (5) MultVAE [10] introduces hierarchical variational autoencoders to capture user preferences from historical interactions.

$\bullet$ Conventional cross-domain methods. (1) CoNet [8] achieves deep bidirectional knowledge transfer by adding bidirectional connections to a multilayer feed-forward network. (2) BiTGCF [16] realizes bidirectional knowledge transfer by exploiting the higher-order connectivity through a feature propagation layer, employing overlapped users as a bridge.

$\bullet$ Disentanglement-based cross-domain methods. (1) DRMTCDR [3] disentangles user preferences into domain-shared and domain-specific components through graph contrastive learning. (2) DisenCDR [2] introduces two disentanglement regularizers based on mutual information to accomplish user representation disentanglement.

5.1.3 Evaluation Protocols.

To ensure fairness and efficiency, we employ a random partitioning strategy for each dataset, allocating 80% to the training set, 10% to the validation set, and 10% to the test set. To avoid the sampling bias of the candidate selection, we adopt the whole item set as the candidate item set during evaluation [11]. In this investigation, we utilize Recall (Recall@20) and Normalized Discounted Cumulative Gain (NDCG@20) as metrics for evaluating the performance of top-K recommendations [7, 16].
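For reference, the two metrics can be sketched for a single user as below, assuming binary relevance and the commonly used definitions (Recall normalised by the number of held-out items, NDCG with a $\log_2$ position discount). The exact formulas used by the evaluation framework are not stated in the text, so this is an illustrative sketch with hypothetical function names and toy inputs.

```python
import math


def recall_at_k(ranked_items, relevant_items, k):
    """Share of a user's held-out items that appear in the top-k list."""
    if not relevant_items:
        return 0.0
    hits = sum(1 for item in ranked_items[:k] if item in relevant_items)
    return hits / len(relevant_items)


def ndcg_at_k(ranked_items, relevant_items, k):
    """Binary-relevance NDCG: discounted gain of hits divided by the ideal ordering's gain."""
    dcg = sum(1.0 / math.log2(pos + 2)
              for pos, item in enumerate(ranked_items[:k]) if item in relevant_items)
    ideal = sum(1.0 / math.log2(pos + 2) for pos in range(min(len(relevant_items), k)))
    return dcg / ideal if ideal > 0 else 0.0


# Hypothetical ranking over the full item set and held-out test items for one user.
ranked = [5, 3, 9, 1]
relevant = {3, 1, 7}
user_recall = recall_at_k(ranked, relevant, k=3)
user_ndcg = ndcg_at_k(ranked, relevant, k=3)
```

Dataset-level Recall@20 and NDCG@20 are then typically obtained by averaging the per-user values with `k=20`.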

5.1.4 Parameter Settings.

We implement the proposed AREIL method within the RecBole-CDR [46] framework. During training, we set the maximum number of training epochs to 1000 and employ an early-stopping strategy based on the value of NDCG@20. For a fair comparison, we adopt the same parameter settings for all methods. The dimensions of user and item embeddings are both set to 64, while the regularization weight $\beta$ is tuned within the range of $\{10^{-3},10^{-2},10^{-1}\}$. The learning rate is tuned within the range of $\{10^{-4},10^{-3},10^{-2}\}$, and the Adam optimizer is employed for parameter updates. For GNN-based models, we search the number of GNN layers in $\{2,3,4\}$. Regarding the loss weights $\lambda_{1}$ and $\lambda_{2}$, we search the values from $\{10^{-4},10^{-3},10^{-2},10^{-1},1.0\}$. Furthermore, for the hyper-parameters $\gamma_{s}$ and $\gamma_{t}$ used during the enhancement, we search within the range of $\{0.8,0.85,0.9,0.95\}$.

5.2 Performance Comparison

Table 2 presents the experimental results of all models across the three datasets. Through the performance comparison, we can draw the following conclusions:

$\bullet$ Cross-domain recommendation methods generally outperform single-domain recommendation methods, especially in domains with sparse data. This suggests that CDR methods effectively transfer information across domains, mitigating the issue of data sparsity.

$\bullet$ GNN-based recommendation methods typically yield superior performance, affirming their capability to capture higher-order collaborative information and emphasizing the necessity of intra-domain representation enhancement.

$\bullet$ Disentanglement-based methods, such as DRMTCDR and DisenCDR, do not always achieve satisfactory recommendation performance on all datasets, which implies the need to develop rigorous disentanglement constraints.

$\bullet$ AREIL outperforms all baselines, demonstrating its effectiveness. Superiority over the GNN-based baseline confirms the value of adaptive inter-domain enhancement, while outperforming the disentanglement-based baseline validates the effectiveness of inversed learning.

5.3 Ablation Study

To validate the effectiveness of our proposed model, we conducted ablation experiments comparing AREIL with three variants: (1) w/o graph: replacing LightGCN in AREIL with matrix decomposition; (2) w/o AREM: removing the Inter-domain Adaptive Representation Enhancement Module by setting both $\gamma_{s}$ and $\gamma_{t}$ to 1; and (3) w/o IRLM: removing the Inversed Representation Learning Module by setting $\lambda_{1}$ to 0. The performance comparison between AREIL and the three variants is presented in Table 3.

The results indicate that all variants perform significantly worse than AREIL. (1) The variant w/o graph exhibits the largest performance decline, which implies that enhancing embeddings with intra-domain higher-order collaborative signals is not only effective but also necessary. (2) The variant w/o AREM suffers a 16.85% decline in NDCG@20 in the Sport domain of the Sport&Phone dataset. This shows that relying solely on domain classifiers is inadequate for bridging dataset disparities and that adaptive inter-domain enhancement is necessary. (3) The variant w/o IRLM consistently lags behind AREIL, exhibiting a decrease of 15.38% in the Sport domain. This highlights the importance of inversed representation learning, which IRLM achieves by exploiting self-supervised signals.

5.4 Impact of Hyper-parameters

5.4.1 Impact of the classification loss weight $\lambda_{1}$.

The weight $\lambda_{1}$ of the classification loss strongly influences the efficacy of inversed representation learning. Taking the Sport&Phone dataset as an example, we vary $\lambda_{1}$ within $\{10^{-4},10^{-3},10^{-2},10^{-1},1.0\}$ while holding the remaining parameters constant. The results, depicted in Fig. 2, indicate that the model shows limited sensitivity to $\lambda_{1}$ within this range. The best result is achieved with a moderate $\lambda_{1}$: lower values lead to suboptimal disentanglement quality, while excessively large values shift optimization away from the intended recommendation task. This demonstrates the effectiveness of the introduced domain classifiers and gradient reversal layers in inversed representation learning.

5.4.2 Impact of the fusion controlling weight $\gamma_{s}$.

We tune $\gamma_{s}$ to regulate inter-domain enhancement in the source domain, exploring values within $\{0.8,0.85,0.9,0.95,1.0\}$. The experimental results on the Sport&Phone dataset are presented in Fig. 3; optimal performance occurs at an intermediate value of $\gamma_{s}$. If $\gamma_{s}$ is excessively low, the model deviates from our assumption, overlooking that features in domain-shared embeddings may vary in importance and generality across different domains. If $\gamma_{s}$ is excessively high, the model assigns low weights to the other domain, impeding information transfer. This demonstrates the necessity and effectiveness of the adaptive enhancement module we introduced. The trend for $\gamma_{t}$ exhibits a comparable pattern.
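To make the role of the fusion weight concrete, the sketch below uses a plain convex combination of a user's two domain-shared embeddings. This is a simplifying assumption for illustration only: the actual module relies on self-attention, and the function name `fuse_shared` is hypothetical. The sketch is consistent with the text in one respect: setting the weight to 1 removes the contribution of the other domain, which corresponds to the w/o AREM variant.

```python
def fuse_shared(own_shared, other_shared, gamma):
    """Blend a user's domain-shared embedding with the other domain's.

    Assumed form (illustrative, not the paper's exact formulation):
        fused = gamma * own + (1 - gamma) * other
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    if len(own_shared) != len(other_shared):
        raise ValueError("embeddings must have the same dimension")
    return [gamma * own + (1.0 - gamma) * other
            for own, other in zip(own_shared, other_shared)]
```

Under this reading, a weight close to 1 leaves little room for the other domain, while a low weight treats both domains' shared features as nearly interchangeable, mirroring the two failure modes discussed above.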

5.5 Representation Visualization

5.5.1 Analysis on the inversed representation learning.

To provide a more intuitive display, we randomly sample 1000 pairs of disentangled user representations from the Elec&Cloth dataset and project them into a 2D space using t-SNE [17], as depicted in Fig. 4(a). We observe a distinct separation between domain-shared and domain-specific embeddings, indicating successful disentanglement of user preferences. Meanwhile, domain-shared embeddings from different domains overlap, while domain-specific ones remain separated, so each component encodes the information it is intended to capture. This demonstrates that our method provides rigorous decoupling constraints and accomplishes inversed representation learning.

5.5.2 Analysis on the necessity of adaptive enhancement.

To analyze the necessity of adaptive inter-domain enhancement, we randomly sample 10 pairs of domain-shared user embeddings and illustrate them in Fig. 4(b). Embeddings of the same user are drawn with matching shapes, while distinct shapes denote different users. Notably, some pairs lie close together, suggesting similar user preferences across domains. Meanwhile, certain pairs lie far apart, implying that features present in both domains differ in importance and generality. Fig. 4(b) thus underscores the need for adaptive representation enhancement.

6 Conclusion

This paper introduced a novel approach to dual-target cross-domain recommendation by focusing on adaptive representation enhancement and inversed representation learning. Specifically, we first disentangled mixed user preferences by dividing user representations into domain-shared and domain-specific components. To further strengthen user representations, we proposed an adaptive enhancement module that captures high-order information and reveals inter-domain correlations. Next, within a unified framework, we leveraged inversed constraints to learn truly disentangled user preferences. Finally, we optimized the entire framework via multi-task learning. Extensive experiments demonstrate that AREIL significantly outperforms state-of-the-art baselines. In the future, we will explore incorporating additional attribute information for even more efficient inter-domain enhancement.

References

[1] Burgess, C.P., Higgins, I., Pal, A., Matthey, L., Watters, N., Desjardins, G., Lerchner, A.: Understanding disentangling in $\beta$-VAE. arXiv preprint arXiv:1804.03599 (2018)
[2] Cao, J., Lin, X., Cong, X., Ya, J., Liu, T., Wang, B.: Disencdr: Learning disentangled representations for cross-domain recommendation. In: SIGIR. pp. 267–277 (2022)
[3] Guo, X., Li, S., Guo, N., Cao, J., Liu, X., Ma, Q., Gan, R., Zhao, Y.: Disentangled representations learning for multi-target cross-domain recommendation. TOIS 41(4), 1–27 (2023)
[4] Han, Y., Wang, H., Wang, K., Wu, L., Li, Z., Guo, W., Liu, Y., Lian, D., Chen, E.: A survey on large language models for recommendation. arXiv preprint arXiv:2403.17603 (2023)
[5] Han, Y., Wang, H., Wang, K., Wu, L., Li, Z., Guo, W., Liu, Y., Lian, D., Chen, E.: End4rec: Efficient noise-decoupling for multi-behavior sequential recommendation. arXiv preprint arXiv:2403.17603 (2024)
[6] He, R., McAuley, J.: Ups and downs: Modeling the visual evolution of fashion trends with one-class collaborative filtering. In: WWW. pp. 507–517 (2016)
[7] He, X., Deng, K., Wang, X., Li, Y., Zhang, Y., Wang, M.: Lightgcn: Simplifying and powering graph convolution network for recommendation. In: SIGIR. pp. 639–648 (2020)
[8] Hu, G., Zhang, Y., Yang, Q.: Conet: Collaborative cross networks for cross-domain recommendation. In: CIKM. pp. 667–676 (2018)
[9] Hu, G., Zhang, Y., Yang, Q.: Mtnet: a neural approach for cross-domain recommendation with unstructured text. KDD Deep Learning Day pp. 1–10 (2018)
[10] Kim, D., Suh, B.: Enhancing vaes for collaborative filtering: flexible priors & gating mechanisms. In: RecSys. pp. 403–407 (2019)
[11] Krichene, W., Rendle, S.: On sampled metrics for item recommendation. In: KDD. pp. 1748–1757 (2020)
[12] Kumar, A., Kumar, N., Hussain, M., Chaudhury, S., Agarwal, S.: Semantic clustering-based cross-domain recommendation. In: CIDM. pp. 137–141. IEEE (2014)
[13] Li, B., Yang, Q., Xue, X.: Can movies and books collaborate? cross-domain collaborative filtering for sparsity reduction. In: IJCAI (2009)
[14] Li, C., Xie, Y., Yu, C., Hu, B., Li, Z., Shu, G., Qie, X., Niu, D.: One for all, all for one: Learning and transferring user embeddings for cross-domain recommendation. In: WSDM. pp. 366–374 (2023)
[15] Li, P., Tuzhilin, A.: Ddtcdr: Deep dual transfer cross domain recommendation. In: WSDM. pp. 331–339 (2020)
[16] Liu, M., Li, J., Li, G., Pan, P.: Cross domain recommendation via bi-directional transfer graph collaborative filtering networks. In: CIKM. pp. 885–894 (2020)
[17] Van der Maaten, L., Hinton, G.: Visualizing data using t-sne. JMLR 9(11) (2008)
[18] Moreno, O., Shapira, B., Rokach, L., Shani, G.: Talmud: transfer learning for multiple domains. In: CIKM. pp. 425–434 (2012)
[19] Ning, W., Yan, X., Liu, W., Cheng, R., Zhang, R., Tang, B.: Multi-domain recommendation with embedding disentangling and domain alignment. In: CIKM. pp. 1917–1927 (2023)
[20] Rendle, S., Freudenthaler, C., Gantner, Z., Schmidt-Thieme, L.: Bpr: Bayesian personalized ranking from implicit feedback. arXiv preprint arXiv:1205.2618 (2012)
[21] Singh, A.P., Gordon, G.J.: Relational learning via collective matrix factorization. In: KDD. pp. 650–658 (2008)
[22] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. NeurIPS 30 (2017)
[23] Wang, C., Yu, Y., Ma, W., Zhang, M., Chen, C., Liu, Y., Ma, S.: Towards representation alignment and uniformity in collaborative filtering. In: KDD. pp. 1816–1825 (2022)
[24] Wang, H., Lian, D., Tong, H., Liu, Q., Huang, Z., Chen, E.: Decoupled representation learning for attributed networks. TKDE 35(3), 2430–2444 (2021)
[25] Wang, H., Lian, D., Tong, H., Liu, Q., Huang, Z., Chen, E.: Hypersorec: Exploiting hyperbolic user and item representations with multiple aspects for social-aware recommendation. TOIS 40(2), 1–28 (2021)
[26] Wang, H., Xu, T., Liu, Q., Lian, D., Chen, E., Du, D., Wu, H., Su, W.: Mcne: An end-to-end framework for learning multiple conditional network representations of social network. In: KDD. pp. 1064–1072 (2019)
[27] Wang, K., Zhu, Y., Liu, H., Zang, T., Wang, C., Liu, K.: Inter-and intra-domain relation-aware heterogeneous graph convolutional networks for cross-domain recommendation. In: DASFAA. pp. 53–68. Springer (2022)
[28] Wang, X., He, X., Wang, M., Feng, F., Chua, T.S.: Neural graph collaborative filtering. In: SIGIR. pp. 165–174 (2019)
[29] Wang, X., Jin, H., Zhang, A., He, X., Xu, T., Chua, T.S.: Disentangled graph collaborative filtering. In: SIGIR. pp. 1001–1010 (2020)
[30] Wang, X., Chen, H., Zhou, Y., Ma, J., Zhu, W.: Disentangled representation learning for recommendation. TPAMI 45(1), 408–424 (2022)
[31] Wang, Y., Li, Y., Li, S., Song, W., Fan, J., Gao, S., Ma, L., Cheng, B., Cai, X., Wang, S., et al.: Deep graph mutual learning for cross-domain recommendation. In: DASFAA. pp. 298–305. Springer (2022)
[32] Wu, J., Fan, W., Chen, J., Liu, S., Li, Q., Tang, K.: Disentangled contrastive learning for social recommendation. In: CIKM. pp. 4570–4574 (2022)
[33] Wu, L., Zheng, Z., Qiu, Z., Wang, H., Gu, H., Shen, T., Qin, C., Zhu, C., Zhu, H., Liu, Q., et al.: A survey on large language models for recommendation. arXiv preprint arXiv:2305.19860 (2023)
[34] Xiao, S., Zhu, D., Tang, C., Huang, Z.: Catcl: Joint cross-attention transfer and contrastive learning for cross-domain recommendation. In: DASFAA. pp. 446–461. Springer (2023)
[35] Xin, X., Liu, Z., Lin, C.Y., Huang, H., Wei, X., Guo, P.: Cross-domain collaborative filtering with review text. In: IJCAI (2015)
[36] Yi, Z., Ounis, I., Macdonald, C.: Contrastive graph prompt-tuning for cross-domain recommendation. TOIS (2023)
[37] Yin, M., Wang, H., Xu, X., Wu, L., Zhao, S., Guo, W., Liu, Y., Tang, R., Lian, D., Chen, E.: Apgl4sr: A generic framework with adaptive and personalized global collaborative information in sequential recommendation. In: CIKM. pp. 3009–3019 (2023)
[38] Yu, J., Yin, H., Xia, X., Chen, T., Cui, L., Nguyen, Q.V.H.: Are graph augmentations necessary? simple graph contrastive learning for recommendation. In: SIGIR. pp. 1294–1303 (2022)
[39] Yuan, F., Yao, L., Benatallah, B.: Darec: Deep domain adaptation for cross-domain recommendation via transferring rating patterns. arXiv preprint arXiv:1905.10760 (2019)
[40] Zang, T., Zhu, Y., Liu, H., Zhang, R., Yu, J.: A survey on cross-domain recommendation: taxonomies, methods, and future directions. TOIS 41(2), 1–39 (2022)
[41] Zhang, Q., Liao, W., Zhang, G., Yuan, B., Lu, J.: A deep dual adversarial network for cross-domain recommendation. TKDE (2021)
[42] Zhang, R., Zang, T., Zhu, Y., Wang, C., Wang, K., Yu, J.: Disentangled contrastive learning for cross-domain recommendation. In: DASFAA. pp. 163–178. Springer (2023)
[43] Zhang, Y., Zhang, Y., Guo, W., Cai, X., Yuan, X.: Learning disentangled representation for multimodal cross-domain sentiment analysis. TNNLS (2022)
[44] Zhang, Y., Chen, E., Jin, B., Wang, H., Hou, M., Huang, W., Yu, R.: Clustering based behavior sampling with long sequential data for ctr prediction. In: SIGIR. pp. 2195–2200 (2022)
[45] Zhao, C., Li, C., Fu, C.: Cross-domain recommendation via preference propagation graphnet. In: CIKM. pp. 2165–2168 (2019)
[46] Zhao, W.X., Mu, S., Hou, Y., Lin, Z., Chen, Y., Pan, X., Li, K., Lu, Y., Wang, H., Tian, C., et al.: Recbole: Towards a unified, comprehensive and efficient framework for recommendation algorithms. In: CIKM. pp. 4653–4664 (2021)
[47] Zhu, F., Wang, Y., Chen, C., Liu, G., Zheng, X.: A graphical and attentional framework for dual-target cross-domain recommendation. In: IJCAI. pp. 3001–3008 (2020)
[48] Zhu, F., Wang, Y., Chen, C., Zhou, J., Li, L., Liu, G.: Cross-domain recommendation: challenges, progress, and prospects. arXiv preprint arXiv:2103.01696 (2021)
[49] Zhu, J., Wang, Y., Zhu, F., Sun, Z.: Domain disentanglement with interpolative data augmentation for dual-target cross-domain recommendation. In: RecSys. pp. 515–527 (2023)