-
Scaling the Vocabulary of Non-autoregressive Models for Efficient Generative Retrieval
Authors:
Ravisri Valluri,
Akash Kumar Mohankumar,
Kushal Dave,
Amit Singh,
Jian Jiao,
Manik Varma,
Gaurav Sinha
Abstract:
Generative Retrieval introduces a new approach to Information Retrieval by reframing it as a constrained generation task, leveraging recent advancements in Autoregressive (AR) language models. However, AR-based Generative Retrieval methods suffer from high inference latency and cost compared to traditional dense retrieval techniques, limiting their practical applicability. This paper investigates fully Non-autoregressive (NAR) language models as a more efficient alternative for generative retrieval. While standard NAR models alleviate latency and cost concerns, they exhibit a significant drop in retrieval performance (compared to AR models) due to their inability to capture dependencies between target tokens. To address this, we question the conventional choice of limiting the target token space to solely words or sub-words. We propose PIXAR, a novel approach that expands the target vocabulary of NAR models to include multi-word entities and common phrases (up to 5 million tokens), thereby reducing token dependencies. PIXAR employs inference optimization strategies to maintain low inference latency despite the significantly larger vocabulary. Our results demonstrate that PIXAR achieves a relative improvement of 31.0% in MRR@10 on MS MARCO and 23.2% in Hits@5 on Natural Questions compared to standard NAR models with similar latency and cost. Furthermore, online A/B experiments on a large commercial search engine show that PIXAR increases ad clicks by 5.08% and revenue by 4.02%.
Submitted 10 June, 2024;
originally announced June 2024.
-
Foundational Verification of Smart Contracts through Verified Compilation
Authors:
Vilhelm Sjöberg,
Kinnari Dave,
Daniel Britten,
Maria A Schett,
Xinyuan Sun,
Qinshi Wang,
Sean Noble Anderson,
Steve Reeves,
Zhong Shao
Abstract:
Programs executed on a blockchain - smart contracts - have high financial stakes; their correctness is crucial. We argue that this correctness needs to be foundational: it needs to be based on the operational semantics of their execution environment. In this work we present a foundational system - the DeepSEA system - targeting the Ethereum blockchain as the largest smart contract platform. The DeepSEA system has a small but sufficiently rich programming language amenable to verification, the DeepSEA language, and a verified DeepSEA compiler. Together they enable true end-to-end verification for smart contracts. We demonstrate usability through two case studies: a realistic contract for Decentralized Finance and a contract for crowdfunding.
Submitted 14 May, 2024;
originally announced May 2024.
-
Metalloporphyrins on Oxygen-Passivated Iron: Conformation and Order Beyond the First Layer
Authors:
David Maximilian Janas,
Andreas Windischbacher,
Mira Sophie Arndt,
Michael Gutnikov,
Lasse Sternemann,
David Gutnikov,
Till Willershausen,
Jonah Elias Nitschke,
Karl Schiller,
Daniel Baranowski,
Vitaliy Feyer,
Iulia Cojocariu,
Khush Dave,
Peter Puschnig,
Matija Stupar,
Stefano Ponzoni,
Mirko Cinchetti,
Giovanni Zamborlini
Abstract:
On-surface metal porphyrins can undergo electronic and conformational changes that play a crucial role in determining the chemical reactivity of the molecular layer. Therefore, understanding those properties is pivotal for the design and implementation of organic-based devices. Here, by means of photoemission orbital tomography supported by density functional theory calculations, we investigate the electronic and geometrical structure of two metallated tetraphenyl porphyrins (MTPPs), namely ZnTPP and NiTPP, adsorbed on the oxygen-passivated Fe(100)-p(1x1)O surface. Both molecules weakly interact with the surface as no charge transfer is observed. In the case of ZnTPP our data correspond to those of moderately distorted molecules, while NiTPP exhibits a severe saddle-shape deformation. From additional experiments on NiTPP multilayer films, we conclude that this distortion is a consequence of the interaction with the substrate, as the NiTPP macrocycle of the second layer turns out to be flat. We further find that distortions in the MTPP macrocycle are accompanied by an increasing energy gap between the highest occupied molecular orbitals (HOMO and HOMO-1). Our results demonstrate that photoemission orbital tomography can simultaneously probe the energy level alignment, the azimuthal orientation, and the adsorption geometry of complex aromatic molecules even in the multilayer regime.
Submitted 3 August, 2023; v1 submitted 26 April, 2023;
originally announced April 2023.
-
NGAME: Negative Mining-aware Mini-batching for Extreme Classification
Authors:
Kunal Dahiya,
Nilesh Gupta,
Deepak Saini,
Akshay Soni,
Yajun Wang,
Kushal Dave,
Jian Jiao,
Gururaj K,
Prasenjit Dey,
Amit Singh,
Deepesh Hada,
Vidit Jain,
Bhawna Paliwal,
Anshul Mittal,
Sonu Mehta,
Ramachandran Ramjee,
Sumeet Agarwal,
Purushottam Kar,
Manik Varma
Abstract:
Extreme Classification (XC) seeks to tag data points with the most relevant subset of labels from an extremely large label set. Performing deep XC with dense, learnt representations for data points and labels has attracted much attention due to its superiority over earlier XC methods that used sparse, hand-crafted features. Negative mining techniques have emerged as a critical component of all deep XC methods that allow them to scale to millions of labels. However, despite recent advances, training deep XC models with large encoder architectures such as transformers remains challenging. This paper identifies that memory overheads of popular negative mining techniques often force mini-batch sizes to remain small and slow training down. In response, this paper introduces NGAME, a light-weight mini-batch creation technique that offers provably accurate in-batch negative samples. This allows training with larger mini-batches offering significantly faster convergence and higher accuracies than existing negative sampling techniques. NGAME was found to be up to 16% more accurate than state-of-the-art methods on a wide array of benchmark datasets for extreme classification, as well as 3% more accurate at retrieving search engine queries in response to a user webpage visit to show personalized ads. In live A/B tests on a popular search engine, NGAME yielded up to 23% gains in click-through-rates.
Submitted 10 July, 2022;
originally announced July 2022.
-
DeepXML: A Deep Extreme Multi-Label Learning Framework Applied to Short Text Documents
Authors:
Kunal Dahiya,
Deepak Saini,
Anshul Mittal,
Ankush Shaw,
Kushal Dave,
Akshay Soni,
Himanshu Jain,
Sumeet Agarwal,
Manik Varma
Abstract:
Scalability and accuracy are well-recognized challenges in deep extreme multi-label learning, where the objective is to train architectures for automatically annotating a data point with the most relevant subset of labels from an extremely large label set. This paper develops the DeepXML framework that addresses these challenges by decomposing the deep extreme multi-label task into four simpler sub-tasks, each of which can be trained accurately and efficiently. Choosing different components for the four sub-tasks allows DeepXML to generate a family of algorithms with varying trade-offs between accuracy and scalability. In particular, DeepXML yields the Astec algorithm that could be 2-12% more accurate and 5-30x faster to train than leading deep extreme classifiers on publicly available short text datasets. Astec could also efficiently train on Bing short text datasets containing up to 62 million labels while making predictions for billions of users and data points per day on commodity hardware. This allowed Astec to be deployed on the Bing search engine for a number of short text applications, ranging from matching user queries to advertiser bid phrases to showing personalized ads, where it yielded significant gains in click-through-rates, coverage, revenue and other online metrics over state-of-the-art techniques currently in production. DeepXML's code is available at https://github.com/Extreme-classification/deepxml
Submitted 12 November, 2021;
originally announced November 2021.
-
Study of Interplanetary and Geomagnetic Response of Filament Associated CMEs
Authors:
Kunjal Dave,
Wageesh Mishra,
Nandita Srivastava,
R. M. Jadhav
Abstract:
It has been established that Coronal Mass Ejections (CMEs) may have a significant impact on the terrestrial magnetic field and lead to space weather events. In the present study, we selected several CMEs which are associated with filament eruptions on the Sun. We attempt to identify the presence of filament material within ICMEs at 1 AU. We discuss how different ICMEs associated with filaments lead to moderate or major geomagnetic activity on their arrival at the Earth. Our study also highlights the difficulties in identifying the filament material at 1 AU within isolated and interacting CMEs.
Submitted 30 June, 2018;
originally announced July 2018.
-
Absence of Luttinger's Theorem due to Zeros in the Single-Particle Green Function
Authors:
Kiaran B. Dave,
Philip W. Phillips,
Charles L. Kane
Abstract:
We show exactly with an SU(N) interacting model that even if the ambiguity associated with the placement of the chemical potential, $μ$, for a T=0 gapped system is removed by using the unique value $μ(T\rightarrow 0)$, Luttinger's sum rule is violated even if the ground-state degeneracy is lifted by an infinitesimal hopping. The failure stems from the non-existence of the Luttinger-Ward functional for a system in which the self-energy diverges. Since it is the existence of the Luttinger-Ward functional that is the basis for Luttinger's theorem which relates the charge density to sign changes of the single-particle Green function, no such theorem exists. Experimental data on the cuprates are presented which show a systematic deviation from the Luttinger count, implying a breakdown of the electron quasiparticle picture in strongly correlated electron matter.
Submitted 5 March, 2013; v1 submitted 17 July, 2012;
originally announced July 2012.
-
Universal features of Thermopower in High Tc systems and Quantum Criticality
Authors:
Arti Garg,
B. Sriram Shastry,
Kiaran B. Dave,
Philip Phillips
Abstract:
In high Tc superconductors a wide-ranging connection between the doping dependence of the transition temperature Tc and the room temperature thermopower Q has been observed. A "universal correlation" between these two quantities exists with the thermopower vanishing at optimum doping as noted by OCTHH (Obertelli, Cooper, Tallon, Honma and Hor). In this work we provide an interpretation of this OCTHH universality in terms of a possible underlying quantum critical point (QCP) at Tc. Central to our viewpoint is the recently noted Kelvin formula relating the thermopower to the density derivative of the entropy. Perspective on this formula is gained through a model calculation of the various Kubo formulas in an exactly solved 1-dimensional model with various limiting procedures of wave vector and frequency.
Submitted 13 April, 2011;
originally announced April 2011.