
CEEBERT: Cross-Domain Inference in Early Exit BERT

Authors: Divya Jyoti Bajpai, Manjesh Kumar Hanawal

Abstract: Pre-trained Language Models (PLMs), like BERT, with self-supervision objectives exhibit remarkable performance and generalization across various tasks. However, they suffer in inference latency due to their large size. To address this issue, side branches are attached at intermediate layers, enabling early inference of samples without requiring them to pass through all layers. However, the challenge is to decide at which layer to infer and exit each sample so that the accuracy and latency are balanced. Moreover, the distribution of the samples to be inferred may differ from that used for training, necessitating cross-domain adaptation. We propose an online learning algorithm named Cross-Domain Inference in Early Exit BERT (CeeBERT) that dynamically determines early exits of samples based on the level of confidence at each exit point. CeeBERT learns optimal thresholds from domain-specific confidence observed at intermediate layers on the fly, eliminating the need for labeled data. Experimental results on five distinct datasets with BERT and ALBERT models demonstrate CeeBERT's ability to improve latency by reducing unnecessary computations with minimal drop in performance. By adapting the threshold values, CeeBERT can speed up the BERT/ALBERT models by $2\times$ to $3.5\times$ with minimal drop in accuracy.

Submitted 23 May, 2024; originally announced May 2024.

Comments: Accepted at ACL 2024

arXiv:2404.12686

The MONUMENT Experiment: Ordinary Muon Capture studies for 0$νββ$ decay

Authors: Dhanurdhar Bajpai, Laura Baudis, Viacheslav Belov, Elisabetta Bossio, Thomas E. Cocolios, Hiroyasu Ejiri, Evgenii Sushenok, Maria Fomina, Izyan H. Hashim, Michael Heines, Konstantin Gusev, Sergej Kazartsev, Andreas Knecht, Elizabeth Mondragon, Ng Zheng Wei, Faiznur Othman, Igor Ostrovskiy, Gabriela R. Araujo, Nadyia Rumyantseva, Mario Schwarz, Stefan Schoenert, Mark Shirchenko, Egor Shevchik, Yury Shitov, Jouni Suhonen, et al. (4 additional authors not shown)

Abstract: The MONUMENT experiment measures ordinary muon capture (OMC) on isotopes relevant for neutrinoless double-beta (0$νββ$) decay and nuclear astrophysics. OMC is a particularly attractive tool for improving the theoretical description of 0$νββ$ decay. It involves similar momentum transfers and allows testing the virtual transitions involved in 0$νββ$ decay against experimental data. During the 2021 campaign, MONUMENT measured OMC on $^{76}$Se and $^{136}$Ba, the isotopes relevant for next-generation 0$νββ$ decay searches, like LEGEND and nEXO. The experimental setup has been designed to accurately extract the total and partial muon capture rates, which requires precise reconstruction of energies and time-dependent intensities of the OMC-related $γ$ rays. The setup also includes a veto counter system to allow selecting a clean sample of OMC events. This work provides a detailed description of the MONUMENT setup operated during the 2021 campaign, its two DAQ systems, calibration and analysis approaches, and summarises the achieved detector performance. Future improvements are also discussed.

Submitted 19 April, 2024; originally announced April 2024.

Comments: 18 pages, 20 figures, submitted to EPJC

arXiv:2402.15472

FAIR: Filtering of Automatically Induced Rules

Authors: Divya Jyoti Bajpai, Ayush Maheshwari, Manjesh Kumar Hanawal, Ganesh Ramakrishnan

Abstract: The availability of large annotated data can be a critical bottleneck in training machine learning algorithms successfully, especially when applied to diverse domains. Weak supervision offers a promising alternative by accelerating the creation of labeled training data using domain-specific rules. However, it requires users to write a diverse set of high-quality rules to assign labels to the unlabeled data. Automatic Rule Induction (ARI) approaches circumvent this problem by automatically creating rules from features on a small labeled set and filtering a final set of rules from them. In the ARI approach, the crucial step is to select a high-quality, useful subset of rules from the large set of automatically created rules. In this paper, we propose an algorithm, FAIR (Filtering of Automatically Induced Rules), to filter rules from a large number of automatically induced rules using submodular objective functions that account for the collective precision, coverage, and conflicts of the rule set. We experiment with three ARI approaches and five text classification datasets to validate the superior performance of our algorithm with respect to several semi-supervised label aggregation approaches. Further, we show that FAIR achieves statistically significant results in comparison to existing rule-filtering approaches.

Submitted 23 February, 2024; originally announced February 2024.

Comments: Published at EACL 2024

arXiv:2402.15177

doi 10.1088/1748-0221/19/06/P06008

Validation of the VUV-reflective coating for next-generation liquid xenon detectors

Authors: D. Bajpai, A. Best, I. Ostrovskiy, D. Poitras, W. Wang

Abstract: Coating detector materials with films highly reflective in the ultraviolet region improves the sensitivity of rare-event detectors that use liquid xenon. In this work, we investigate the MgF$_2$-Al-MgF$_2$ coating designed to achieve high reflectance at 175 nm, the mean wavelength of liquid xenon (LXe) scintillation. The coating was applied to an unpolished, passivated copper substrate mimicking a realistic detector component of the proposed nEXO experiment, as well as to two unpassivated substrates with "high" and "average" levels of polishing. After confirming the composition and morphology of the thin-film coating using TEM and EDS, the samples underwent reflectance measurements in LXe and gaseous nitrogen (GN2). Measurements in LXe exposed the coated samples to -100 $°$C for several hours. No peeling of the coatings was observed after several thermal cycles. Polishing is found to strongly correlate with the measured specular reflectance ($R_{\mathrm{spec}}$). In particular, 5.8(5)% specular spike reflectance in LXe was measured for the realistic sample at 20$°$ of incidence, while the values for similar angles of incidence on the high and average polish samples are 62.3(1.3)% and 27.4(7)%, respectively. At large angles (66$°$--75$°$), the $R_{\mathrm{spec}}$ in LXe for the three samples increases to 23(5)%, 80(8)%, and 84(18)%, respectively. The $R_{\mathrm{spec}}$ at around 45$°$ was measured in both GN2 and LXe for the average polish sample and shows reasonable agreement. Importantly, the total reflectance of the samples is comparable and estimated to be 92(8)%, 85(8)%, and 83(8)% in GN2 for the realistic, average, and high polish samples, respectively. This is considered satisfactory for the next-generation LXe experiments that could benefit from using reflective films, such as nEXO and DARWIN, thus validating the design of the coating.

Submitted 14 May, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

Comments: As accepted to JINST

Journal ref: 2024_JINST_19_P06008

arXiv:2401.10541

I-SplitEE: Image classification in Split Computing DNNs with Early Exits

Authors: Divya Jyoti Bajpai, Aastha Jaiswal, Manjesh Kumar Hanawal

Abstract: The recent advances in Deep Neural Networks (DNNs) stem from their exceptional performance across various domains. However, their inherent large size hinders deploying these networks on resource-constrained devices like edge, mobile, and IoT platforms. Strategies have emerged, from partial cloud computation offloading (split computing) to integrating early exits within DNN layers. Our work presents an innovative unified approach merging early exits and split computing. We determine the 'splitting layer', the optimal depth in the DNN for edge device computations, and whether a sample is inferred on the edge device or offloaded to the cloud for inference, considering accuracy, computational efficiency, and communication costs. Also, image classification faces diverse environmental distortions, influenced by factors like time of day, lighting, and weather. To adapt to these distortions, we introduce I-SplitEE, an online unsupervised algorithm ideal for scenarios lacking ground truths and with sequential data. Experimental validation using Caltech-256 and CIFAR-10 datasets subjected to varied distortions showcases I-SplitEE's ability to reduce costs by a minimum of 55% with marginal performance degradation of at most 5%.

Submitted 19 January, 2024; originally announced January 2024.

Comments: To appear in proceedings of IEEE International Conference on Communications 2024

arXiv:2309.09195

SplitEE: Early Exit in Deep Neural Networks with Split Computing

Authors: Divya J. Bajpai, Vivek K. Trivedi, Sohan L. Yadav, Manjesh K. Hanawal

Abstract: Deep Neural Networks (DNNs) have drawn attention because of their outstanding performance on various tasks. However, deploying full-fledged DNNs in resource-constrained devices (edge, mobile, IoT) is difficult due to their large size. To overcome the issue, various approaches are considered, like offloading part of the computation to the cloud for final inference (split computing) or performing the inference at an intermediary layer without passing through all layers (early exits). In this work, we propose combining both approaches by using early exits in split computing. In our approach, we decide up to what depth of the DNN to perform computation on the device (splitting layer) and whether a sample can exit from this layer or needs to be offloaded. The decisions are based on a weighted combination of accuracy, computational, and communication costs. We develop an algorithm named SplitEE to learn an optimal policy. Since pre-trained DNNs are often deployed in new domains where the ground truths may be unavailable and samples arrive in a streaming fashion, SplitEE works in an online and unsupervised setup. We extensively perform experiments on five different datasets. SplitEE achieves a significant cost reduction ($>50\%$) with a slight drop in accuracy ($<2\%$) as compared to the case when all samples are inferred at the final layer. The anonymized source code is available at https://anonymous.4open.science/r/SplitEE_M-B989/README.md.

Submitted 17 September, 2023; originally announced September 2023.

Comments: 10 pages, to appear in the proceedings of AIMLSystems 2023

arXiv:2306.16340

Cosmogenic background simulations for the DARWIN observatory at different underground locations

Authors: M. Adrover, L. Althueser, B. Andrieu, E. Angelino, J. R. Angevaare, B. Antunovic, E. Aprile, M. Babicz, D. Bajpai, E. Barberio, L. Baudis, M. Bazyk, N. Bell, L. Bellagamba, R. Biondi, Y. Biondi, A. Bismark, C. Boehm, A. Breskin, E. J. Brookes, A. Brown, G. Bruno, R. Budnik, C. Capelli, J. M. R. Cardoso, et al. (158 additional authors not shown)

Abstract: Xenon dual-phase time projection chambers (TPCs) have proven to be a successful technology in studying physical phenomena that require low-background conditions. With 40 t of liquid xenon (LXe) in the TPC baseline design, DARWIN will have a high sensitivity for the detection of particle dark matter, neutrinoless double beta decay ($0νββ$), and axion-like particles (ALPs). Although cosmic muons are a source of background that cannot be entirely eliminated, they may be greatly diminished by placing the detector deep underground. In this study, we used Monte Carlo simulations to model the cosmogenic background expected for the DARWIN observatory at four underground laboratories: Laboratori Nazionali del Gran Sasso (LNGS), Sanford Underground Research Facility (SURF), Laboratoire Souterrain de Modane (LSM) and SNOLAB. We determine the production rates of unstable xenon isotopes and tritium due to muon-induced neutron fluxes and muon-induced spallation. These are expected to represent the dominant contributions to cosmogenic backgrounds and thus the most relevant for site selection.

Submitted 28 June, 2023; originally announced June 2023.

arXiv:2203.14354

doi 10.1088/1748-0221/17/07/P07018

GPU-based optical simulation of the DARWIN detector

Authors: L. Althueser, B. Antunović, E. Aprile, D. Bajpai, L. Baudis, D. Baur, A. L. Baxter, L. Bellagamba, R. Biondi, Y. Biondi, A. Bismark, A. Brown, R. Budnik, A. Chauvin, A. P. Colijn, J. J. Cuenca-García, V. D'Andrea, P. Di Gangi, J. Dierle, S. Diglio, M. Doerenkamp, K. Eitel, S. Farrell, A. D. Ferella, C. Ferrari, et al. (55 additional authors not shown)

Abstract: Understanding propagation of scintillation light is critical for maximizing the discovery potential of next-generation liquid xenon detectors that use dual-phase time projection chamber technology. This work describes a detailed optical simulation of the DARWIN detector implemented using Chroma, a GPU-based photon tracking framework. To evaluate the framework and to explore ways of maximizing efficiency and minimizing the time of light collection, we simulate several variations of the conventional detector design. Results of these selected studies are presented. More generally, we conclude that the approach used in this work allows one to investigate alternative designs faster and in more detail than using conventional Geant4 optical simulations, making it an attractive tool to guide the development of the ultimate liquid xenon observatory.

Submitted 11 July, 2022; v1 submitted 27 March, 2022; originally announced March 2022.

Comments: Updated to address the referees' comments, add few more authors. Journal reference added

Journal ref: JINST 17 (2022) P07018

arXiv:2203.02309

doi 10.1088/1361-6471/ac841a

A Next-Generation Liquid Xenon Observatory for Dark Matter and Neutrino Physics

Authors: J. Aalbers, K. Abe, V. Aerne, F. Agostini, S. Ahmed Maouloud, D. S. Akerib, D. Yu. Akimov, J. Akshat, A. K. Al Musalhi, F. Alder, S. K. Alsum, L. Althueser, C. S. Amarasinghe, F. D. Amaro, A. Ames, T. J. Anderson, B. Andrieu, N. Angelides, E. Angelino, J. Angevaare, V. C. Antochi, D. Antón Martin, B. Antunovic, E. Aprile, H. M. Araújo, et al. (572 additional authors not shown)

Abstract: The nature of dark matter and properties of neutrinos are among the most pressing issues in contemporary particle physics. The dual-phase xenon time-projection chamber is the leading technology to cover the available parameter space for Weakly Interacting Massive Particles (WIMPs), while featuring extensive sensitivity to many alternative dark matter candidates. These detectors can also study neutrinos through neutrinoless double-beta decay and through a variety of astrophysical sources. A next-generation xenon-based detector will therefore be a true multi-purpose observatory to significantly advance particle physics, nuclear physics, astrophysics, solar physics, and cosmology. This review article presents the science cases for such a detector.

Submitted 4 March, 2022; originally announced March 2022.

Comments: 77 pages, 40 figures, 1262 references

Report number: INT-PUB-22-003

Journal ref: J. Phys. G: Nucl. Part. Phys. 50 (2023) 013001

arXiv:2107.01098

Temporal Analysis of Worldwide War

Authors: Devansh Bajpai, Rishi Ranjan Singh

Abstract: Analysis of wars and conflicts between regions has been an important topic of interest throughout the history of humankind. In the latter part of the 20th century, in the aftermath of two World Wars and the shadow of nuclear, biological, and chemical holocaust, more was written on the subject than ever before. Wars have a negative impact on a country's economy, social order, infrastructure, and public health. In this paper, we study the wars fought in history and draw conclusions from them. We explore the participation of countries in wars and the nature of relationships between various countries during different timelines. A big part of today's wars is fought against terrorism. Therefore, this study also attempts to shed light on different countries' exposure to terrorist encounters and analyses the impact of wars on a country's economy in terms of change in GDP.

Submitted 27 June, 2021; originally announced July 2021.

arXiv:1508.06865

Anonymity in Predicting the Future

Authors: Dvij Bajpai, Daniel J. Velleman

Abstract: Consider an arbitrary set $S$ and an arbitrary function $f : \mathbb{R} \to S$. We think of the domain of $f$ as representing time, and for each $x \in \mathbb{R}$, we think of $f(x)$ as the state of some system at time $x$. Imagine that, at each time $x$, there is an agent who can see $f \upharpoonright (-\infty, x)$ and is trying to guess $f(x)$; in other words, the agent is trying to guess the present state of the system from its past history. In a 2008 paper, Christopher Hardin and Alan Taylor use the axiom of choice to construct a strategy that the agents can use to guarantee that, for every function $f$, all but countably many of them will guess correctly. In a 2013 monograph they introduce the idea of anonymous guessing strategies, in which the agents can see the past but don't know where they are located in time. In this paper we consider a number of variations on anonymity. For instance, what if, in addition to not knowing where they are located in time, agents also do not know the rate at which time is progressing? What if they have no sense of how much time elapses between any two events? We show that in some cases agents can still guess successfully, while in others they perform very poorly.

Submitted 27 August, 2015; originally announced August 2015.

Comments: 12 pages, 1 figure

MSC Class: 03E05 (Primary); 03E25 (Secondary)

arXiv:1410.0591

Non-archimedean connected Julia sets with branching

Authors: Dvij Bajpai, Robert L. Benedetto, Ruqian Chen, Edward Kim, Owen Marschall, Darius Onul, Yang Xiao

Abstract: We construct the first examples of rational functions defined over a non-archimedean field with certain dynamical properties. In particular, we find such functions whose Julia sets, in the Berkovich projective line, are connected but not contained in a line segment. We also show how to compute the measure-theoretic and topological entropy of such maps. In particular, we show for some of our examples that the measure-theoretic entropy is strictly smaller than the topological entropy, thus answering a question of Favre and Rivera-Letelier.

Submitted 6 April, 2015; v1 submitted 2 October, 2014; originally announced October 2014.

Comments: Minor revisions, including simplified computation of Gurevich entropy

MSC Class: 37P40

Showing 1–12 of 12 results for author: Bajpai, D