MemoNet: Memorizing All Cross Features' Representations Efficiently via Multi-Hash Codebook Network for CTR Prediction

Zhang, Pengtao; Zhang, Junlin

doi:10.1145/3583780.3614963

arXiv:2211.01334 (cs)

[Submitted on 25 Oct 2022 (v1), last revised 4 Sep 2023 (this version, v3)]

Abstract: Recent findings in natural language processing (NLP) demonstrate that strong memorization capability contributes substantially to the success of large language models (LLMs). This inspires us to explicitly bring an independent memory mechanism into the CTR ranking model to learn and memorize cross features' representations. In this paper, we propose the multi-Hash Codebook NETwork (HCNet) as the memory mechanism for efficiently learning and memorizing representations of cross features in CTR tasks. HCNet uses a multi-hash codebook as the main memory store, and the whole memory procedure consists of three phases: multi-hash addressing, memory restoring, and feature shrinking. We also propose a new CTR model named MemoNet, which combines HCNet with a DNN backbone. Extensive experimental results on three public datasets and an online test show that MemoNet outperforms state-of-the-art approaches. In addition, MemoNet exhibits the scaling law observed for large language models in NLP: enlarging the codebook in HCNet yields sustained performance gains. Our work demonstrates the importance and feasibility of learning and memorizing representations of cross features, which sheds light on a promising new research direction.
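
The three phases named in the abstract (multi-hash addressing, memory restoring, feature shrinking) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract gives no architectural details, so everything below is an assumption made for illustration, including the restriction to pairwise crosses, the seeded BLAKE2 hashes, the random linear "restore" projection, mean pooling as a stand-in for feature shrinking, and all names (`MultiHashCodebook`, `cross_feature_keys`, `shrink`) and sizes.

```python
import hashlib
import random
from itertools import combinations


def hash_index(key: str, seed: int, codebook_size: int) -> int:
    """Map a cross-feature key to a codebook row with a seeded hash."""
    digest = hashlib.blake2b(f"{seed}:{key}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big") % codebook_size


def make_matrix(rows: int, cols: int, rng: random.Random) -> list:
    """Small random matrix standing in for learned parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(vec: list, matrix: list) -> list:
    """Multiply a row vector (length r) by an r x c matrix."""
    return [sum(v * row[j] for v, row in zip(vec, matrix))
            for j in range(len(matrix[0]))]


class MultiHashCodebook:
    """Hypothetical codebook memory; all sizes are illustrative assumptions."""

    def __init__(self, codebook_size=1000, code_dim=4, num_hashes=2,
                 out_dim=8, seed=0):
        rng = random.Random(seed)
        self.codebook_size = codebook_size
        self.num_hashes = num_hashes
        self.codebook = make_matrix(codebook_size, code_dim, rng)
        self.restore = make_matrix(num_hashes * code_dim, out_dim, rng)

    def address(self, key: str) -> list:
        """Phase 1 (assumed form): one codebook row per hash function."""
        return [hash_index(key, s, self.codebook_size)
                for s in range(self.num_hashes)]

    def lookup(self, key: str) -> list:
        """Phase 2 (assumed form): concatenate codewords, then project."""
        codes = []
        for idx in self.address(key):
            codes.extend(self.codebook[idx])
        return matvec(codes, self.restore)


def cross_feature_keys(sample: dict) -> list:
    """Enumerate pairwise crosses of a sample's fields (an assumption)."""
    items = sorted(sample.items())
    return [f"{fa}={va}&{fb}={vb}"
            for (fa, va), (fb, vb) in combinations(items, 2)]


def shrink(vectors: list) -> list:
    """Phase 3 stand-in: mean-pool the restored vectors into one."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


sample = {"user_id": "u42", "item_id": "i7", "device": "ios"}
memory = MultiHashCodebook()
keys = cross_feature_keys(sample)
restored = [memory.lookup(k) for k in keys]
summary = shrink(restored)
```

In this sketch the number of stored parameters depends on `codebook_size` rather than on the number of distinct cross features, which is the sense in which a fixed-size codebook can cover every cross feature; colliding keys share codewords, and using several hash functions reduces the chance that two keys share all of them.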

Subjects:	Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2211.01334 [cs.IR]
	(or arXiv:2211.01334v3 [cs.IR] for this version)
	https://doi.org/10.48550/arXiv.2211.01334
Journal reference:	ACM International Conference on Information and Knowledge Management (CIKM '23), October 21-25, 2023, Birmingham, United Kingdom
Related DOI:	https://doi.org/10.1145/3583780.3614963

Submission history

From: Zhang Junlin
[v1] Tue, 25 Oct 2022 12:08:14 UTC (2,237 KB)
[v2] Thu, 3 Nov 2022 06:49:56 UTC (1,319 KB)
[v3] Mon, 4 Sep 2023 08:58:04 UTC (1,314 KB)
