-
Better Generalization with Semantic IDs: A Case Study in Ranking for Recommendations
Authors:
Anima Singh,
Trung Vu,
Nikhil Mehta,
Raghunandan Keshavan,
Maheswaran Sathiamoorthy,
Yilin Zheng,
Lichan Hong,
Lukasz Heldt,
Li Wei,
Devansh Tandon,
Ed H. Chi,
Xinyang Yi
Abstract:
Randomly-hashed item ids are used ubiquitously in recommendation models. However, the learned representations from random hashing prevent generalization across similar items, causing problems in learning unseen and long-tail items, especially when the item corpus is large, power-law distributed, and evolving dynamically. In this paper, we propose using content-derived features as a replacement for random ids. We show that simply replacing ID features with content-based embeddings can cause a drop in quality due to reduced memorization capability. To strike a good balance of memorization and generalization, we propose to use Semantic IDs -- a compact discrete item representation learned from frozen content embeddings using RQ-VAE that captures the hierarchy of concepts in items -- as a replacement for random item ids. Similar to content embeddings, the compactness of Semantic IDs poses a problem for easy adaptation in recommendation models. We propose novel methods for adapting Semantic IDs in industry-scale ranking models, through hashing sub-pieces of the Semantic-ID sequences. In particular, we find that the SentencePiece model that is commonly used in LLM tokenization outperforms manually crafted pieces such as N-grams. To this end, we evaluate our approaches in a real-world ranking model for YouTube recommendations. Our experiments demonstrate that Semantic IDs can replace the direct use of video IDs by improving the generalization ability on new and long-tail item slices without sacrificing overall model quality.
Submitted 30 May, 2024; v1 submitted 13 June, 2023;
originally announced June 2023.
-
What can a GNOME do? Search targets for the Global Network of Optical Magnetometers for Exotic physics searches
Authors:
S. Afach,
D. Aybas Tumturk,
H. Bekker,
B. C. Buchler,
D. Budker,
K. Cervantes,
A. Derevianko,
J. Eby,
N. L. Figueroa,
R. Folman,
D. Gavilán Martin,
M. Givon,
Z. D. Grujic,
H. Guo,
P. Hamilton,
M. P. Hedges,
D. F. Jackson Kimball,
S. Khamis,
D. Kim,
E. Klinger,
A. Kryemadhi,
X. Liu,
G. Lukasiewicz,
H. Masia-Roig,
M. Padniuk
, et al. (28 additional authors not shown)
Abstract:
Numerous observations suggest that there exist undiscovered beyond-the-Standard-Model particles and fields. Because of their unknown nature, these exotic particles and fields could interact with Standard Model particles in many different ways and assume a variety of possible configurations. Here we present an overview of the Global Network of Optical Magnetometers for Exotic physics searches (GNOME), our ongoing experimental program designed to test a wide range of exotic physics scenarios. The GNOME experiment utilizes a worldwide network of shielded atomic magnetometers (and, more recently, comagnetometers) to search for spatially and temporally correlated signals due to torques on atomic spins from exotic fields of astrophysical origin. We survey the temporal characteristics of a variety of possible signals currently under investigation such as those from topological defect dark matter (axion-like particle domain walls), axion-like particle stars, solitons of complex-valued scalar fields (Q-balls), stochastic fluctuations of bosonic dark matter fields, a solar axion-like particle halo, and bursts of ultralight bosonic fields produced by cataclysmic astrophysical events such as binary black hole mergers.
Submitted 4 May, 2023; v1 submitted 2 May, 2023;
originally announced May 2023.
-
Search for topological defect dark matter with a global network of optical magnetometers
Authors:
Samer Afach,
Ben C. Buchler,
Dmitry Budker,
Conner Dailey,
Andrei Derevianko,
Vincent Dumont,
Nataniel L. Figueroa,
Ilja Gerhardt,
Zoran D. Grujić,
Hong Guo,
Chuanpeng Hao,
Paul S. Hamilton,
Morgan Hedges,
Derek F. Jackson Kimball,
Dongok Kim,
Sami Khamis,
Thomas Kornack,
Victor Lebedev,
Zheng-Tian Lu,
Hector Masia-Roig,
Madeline Monroy,
Mikhail Padniuk,
Christopher A. Palm,
Sun Yool Park,
Karun V. Paul
, et al. (24 additional authors not shown)
Abstract:
Ultralight bosons such as axion-like particles are viable candidates for dark matter. They can form stable, macroscopic field configurations in the form of topological defects that could concentrate the dark matter density into many distinct, compact spatial regions that are small compared to the galaxy but much larger than the Earth. Here, we report the results of a search for transient signals from axion-like particle domain walls with the Global Network of Optical Magnetometers for Exotic physics searches (GNOME). We search the data, consisting of correlated measurements from optical atomic magnetometers located in laboratories all over the world, for patterns of signals propagating through the network consistent with domain walls. The analysis of data from a continuous month-long operation of the GNOME finds no statistically significant signals, thus placing experimental constraints on such dark matter scenarios.
Submitted 7 December, 2021; v1 submitted 26 February, 2021;
originally announced February 2021.