-
MultiFusionNet: Multilayer Multimodal Fusion of Deep Neural Networks for Chest X-Ray Image Classification
Authors:
Saurabh Agarwal,
K. V. Arya,
Yogesh Kumar Meena
Abstract:
Chest X-ray imaging is a critical diagnostic tool for identifying pulmonary diseases. However, manual interpretation of these images is time-consuming and error-prone. Automated systems utilizing convolutional neural networks (CNNs) have shown promise in improving the accuracy and efficiency of chest X-ray image classification. While previous work has mainly focused on using feature maps from the final convolution layer, there is a need to explore the benefits of leveraging additional layers for improved disease classification. Extracting robust features from limited medical image datasets remains a critical challenge. In this paper, we propose a novel deep learning-based multilayer multimodal fusion model that emphasizes extracting features from different layers and fusing them. Our disease detection model considers the discriminatory information captured by each layer. Furthermore, we propose the fusion of different-sized feature maps (FDSFM) module to effectively merge feature maps from diverse layers. The proposed model achieves accuracies of 97.21% and 99.60% for three-class and two-class classification, respectively. The proposed multilayer multimodal fusion model, along with the FDSFM module, holds promise for accurate disease classification and can also be extended to other disease classifications in chest X-ray images.
Submitted 1 January, 2024;
originally announced January 2024.
-
Blox: A Modular Toolkit for Deep Learning Schedulers
Authors:
Saurabh Agarwal,
Amar Phanishayee,
Shivaram Venkataraman
Abstract:
Deep Learning (DL) workloads have rapidly increased in popularity in enterprise clusters and several new cluster schedulers have been proposed in recent years to support these workloads. With rapidly evolving DL workloads, it is challenging to quickly prototype and compare scheduling policies across workloads. Further, as prior systems target different aspects of scheduling (resource allocation, placement, elasticity, etc.), it is also challenging to combine these techniques and understand the overall benefits. To address these challenges we propose Blox, a modular toolkit which allows developers to compose individual components and realize diverse scheduling frameworks. We identify a set of core abstractions for DL scheduling, implement several existing schedulers using these abstractions, and verify the fidelity of these implementations by reproducing results from prior research. We also highlight how we can evaluate and compare existing schedulers in new settings: different workload traces, higher cluster load, change in DNN workloads and deployment characteristics. Finally, we showcase Blox's extensibility by composing policies from different schedulers, and implementing novel policies with minimal code changes. Blox is available at https://github.com/msr-fiddle/blox.
Submitted 19 December, 2023;
originally announced December 2023.
-
StarVector: Generating Scalable Vector Graphics Code from Images
Authors:
Juan A. Rodriguez,
Shubham Agarwal,
Issam H. Laradji,
Pau Rodriguez,
David Vazquez,
Christopher Pal,
Marco Pedersoli
Abstract:
Scalable Vector Graphics (SVGs) have become integral in modern image rendering applications due to their infinite scalability in resolution, versatile usability, and editing capabilities. SVGs are particularly popular in the fields of web development and graphic design. Existing approaches for SVG modeling using deep learning often struggle with generating complex SVGs and are restricted to simpler ones that require extensive processing and simplification. This paper introduces StarVector, a multimodal SVG generation model that effectively integrates Code Generation Large Language Models (CodeLLMs) and vision models. Our approach utilizes a CLIP image encoder to extract visual representations from pixel-based images, which are then transformed into visual tokens via an adapter module. These visual tokens are pre-pended to the SVG token embeddings, and the sequence is modeled by the StarCoder model using next-token prediction, effectively learning to align the visual and code tokens. This enables StarVector to generate unrestricted SVGs that accurately represent pixel images. To evaluate StarVector's performance, we present SVG-Bench, a comprehensive benchmark for evaluating SVG methods across multiple datasets and relevant metrics. Within this benchmark, we introduce novel datasets including SVG-Stack, a large-scale dataset of real-world SVG examples, and use it to pre-train StarVector as a large foundation model for SVGs. Our results demonstrate significant enhancements in visual quality and complexity handling over current methods, marking a notable advancement in SVG generation technology. Code and models: https://github.com/joanrod/star-vector
Submitted 17 December, 2023;
originally announced December 2023.
-
VecFusion: Vector Font Generation with Diffusion
Authors:
Vikas Thamizharasan,
Difan Liu,
Shantanu Agarwal,
Matthew Fisher,
Michael Gharbi,
Oliver Wang,
Alec Jacobson,
Evangelos Kalogerakis
Abstract:
We present VecFusion, a new neural architecture that can generate vector fonts with varying topological structures and precise control point positions. Our approach is a cascaded diffusion model which consists of a raster diffusion model followed by a vector diffusion model. The raster model generates low-resolution, rasterized fonts with auxiliary control point information, capturing the global style and shape of the font, while the vector model synthesizes vector fonts conditioned on the low-resolution raster fonts from the first stage. To synthesize long and complex curves, our vector diffusion model uses a transformer architecture and a novel vector representation that enables the modeling of diverse vector geometry and the precise prediction of control points. Our experiments show that, in contrast to previous generative models for vector graphics, our new cascaded vector diffusion model generates higher quality vector fonts, with complex structures and diverse styles.
Submitted 21 May, 2024; v1 submitted 16 December, 2023;
originally announced December 2023.
-
Radioisotopes production using lasers: from basic science to applications
Authors:
M. R. D. Rodrigues,
A. Bonasera,
M. Scisciò,
J. A. Pérez-Hernández,
M. Ehret,
F. Filippi,
P. L. Andreoli,
M. Huault,
H. Larreur,
D. Singappuli,
D. Molloy,
D. Raffestin,
M. Alonzo,
G. G. Rapisarda,
D. Lattuada,
G. L. Guardo,
C. Verona,
Fe. Consoli,
G. Petringa,
A. McNamee,
M. La Cognata,
S. Palmerini,
T. Carriere,
M. Cipriani,
G. Di Giorgio
, et al. (15 additional authors not shown)
Abstract:
Laser technologies have improved since the advent of Chirped Pulse Amplification (CPA), which allows energetic laser beams to be compressed to pulse durations of tens of femtoseconds (fs) and focused to a few $μ$m. Protons of tens of MeV can be accelerated using, for instance, the Target Normal Sheath Acceleration (TNSA) method and focused on secondary targets. Under such conditions, nuclear reactions can occur and radioisotopes relevant for medical purposes can be produced. High-repetition-rate lasers can be used to produce enough isotopes for medical applications. This route is competitive with conventional methods, which are mostly based on accelerators. In this paper we study the production of $^{67}$Cu, $^{63}$Zn, $^{18}$F and $^{11}$C, currently used in positron emission tomography (PET) and other applications. At the same time, we study the reactions $^{10}$B(p,$α$)$^{7}$Be and $^{70}$Zn(p,4n)$^{67}$Ga to put further constraints on the proton distributions at different angles and on the reaction $^{11}$B(p,$α$)$^{8}$Be relevant for energy production. The experiment was performed at the 1 petawatt (PW) laser facility at Vega III located in Salamanca, Spain. Angular distributions of radioisotopes in the forward (with respect to the laser direction) and backward directions were measured using a High Purity Germanium (HPGe) detector. Our results are reasonably reproduced by the numerical estimates following the approach of Kimura et al. (NIM A 637 (2011) 167).
Submitted 14 December, 2023;
originally announced December 2023.
-
Approximate Caching for Efficiently Serving Diffusion Models
Authors:
Shubham Agarwal,
Subrata Mitra,
Sarthak Chakraborty,
Srikrishna Karanam,
Koyel Mukherjee,
Shiv Saini
Abstract:
Text-to-image generation using diffusion models has seen explosive popularity owing to their ability to produce high-quality images adhering to text prompts. However, production-grade diffusion model serving is a resource-intensive task that not only requires expensive high-end GPUs but also incurs considerable latency. In this paper, we introduce a technique called approximate-caching that can reduce the iterative denoising steps needed to generate an image from a prompt by reusing intermediate noise states created during a prior image generation for similar prompts. Based on this idea, we present an end-to-end text-to-image system, Nirvana, that uses approximate-caching with a novel cache-management policy, Least Computationally Beneficial and Frequently Used (LCBFU), to provide % GPU compute savings, 19.8% end-to-end latency reduction and 19% dollar savings, on average, on two real production workloads. We further present an extensive characterization of real production text-to-image prompts from the perspective of caching, popularity and reuse of intermediate states in a large production environment.
Submitted 7 December, 2023;
originally announced December 2023.
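The approximate-caching idea in the entry above (reuse an intermediate noise state from a similar earlier prompt and run only the remaining denoising steps) can be illustrated with a small sketch. This is not the Nirvana implementation: the class `ApproximateCache`, the function `generate`, the cosine-similarity lookup, and the similarity thresholds are all hypothetical choices made for illustration, and the LCBFU eviction policy is not modeled.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ApproximateCache:
    """Toy cache mapping a prompt embedding to saved intermediate states."""

    def __init__(self, thresholds):
        # thresholds: (min_similarity, steps_to_skip) pairs, most similar first.
        # The values are illustrative assumptions, not taken from the paper.
        self.thresholds = thresholds
        self.entries = []  # list of (embedding, {steps_done: state})

    def put(self, embedding, states):
        self.entries.append((embedding, states))

    def lookup(self, embedding):
        """Return (steps_to_skip, state), or (0, None) on a miss."""
        if not self.entries:
            return 0, None
        cached_emb, states = max(self.entries, key=lambda e: cosine(embedding, e[0]))
        sim = cosine(embedding, cached_emb)
        for min_sim, k in self.thresholds:
            if sim >= min_sim and k in states:
                return k, states[k]
        return 0, None


def generate(embedding, cache, total_steps, denoise_step, initial_state):
    """Run the denoising loop, skipping the first k steps on a cache hit."""
    k, state = cache.lookup(embedding)
    if state is None:
        k, state = 0, initial_state
    saved = {}
    for step in range(k, total_steps):
        state = denoise_step(state, step, embedding)
        saved[step + 1] = state  # state after (step + 1) denoising steps
    if k == 0:
        cache.put(embedding, saved)  # only full runs are stored in this sketch
    return state, total_steps - k
```

In this sketch a closer match allows more steps to be skipped, which is one plausible way to trade compute savings against fidelity to the new prompt; a real system would also need a bounded cache and an eviction policy.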
-
PEFTDebias: Capturing debiasing information using PEFTs
Authors:
Sumit Agarwal,
Aditya Srikanth Veerubhotla,
Srijan Bansal
Abstract:
The increasing use of foundation models highlights the urgent need to address and eliminate the implicit biases that arise in them during pretraining. In this paper, we introduce PEFTDebias, a novel approach that employs parameter-efficient fine-tuning (PEFT) to mitigate the biases within foundation models. PEFTDebias consists of two main phases: an upstream phase for acquiring debiasing parameters along a specific bias axis, and a downstream phase where these parameters are incorporated into the model and frozen during the fine-tuning process. By evaluating on four datasets across two bias axes, namely gender and race, we find that downstream biases can be effectively reduced with PEFTs. In addition, we show that these parameters possess axis-specific debiasing characteristics, enabling their effective transfer to mitigate biases in various downstream tasks. To ensure reproducibility, we release the code for our experiments.
Submitted 1 December, 2023;
originally announced December 2023.
-
TrustMark: Universal Watermarking for Arbitrary Resolution Images
Authors:
Tu Bui,
Shruti Agarwal,
John Collomosse
Abstract:
Imperceptible digital watermarking is important in copyright protection, misinformation prevention, and responsible generative AI. We propose TrustMark, a GAN-based watermarking method with a novel architecture design and spatio-spectral losses that balance the trade-off between watermarked image quality and watermark recovery accuracy. Our model is trained with robustness in mind, withstanding various in- and out-place perturbations on the encoded image. Additionally, we introduce TrustMark-RM, a watermark removal method useful for re-watermarking. Our methods achieve state-of-the-art performance on 3 benchmarks comprising arbitrary-resolution images.
Submitted 30 November, 2023;
originally announced November 2023.
-
Orca 2: Teaching Small Language Models How to Reason
Authors:
Arindam Mitra,
Luciano Del Corro,
Shweti Mahajan,
Andres Codas,
Clarisse Simoes,
Sahaj Agarwal,
Xuxi Chen,
Anastasia Razdaibiedina,
Erik Jones,
Kriti Aggarwal,
Hamid Palangi,
Guoqing Zheng,
Corby Rosset,
Hamed Khanpour,
Ahmed Awadallah
Abstract:
Orca 1 learns from rich signals, such as explanation traces, allowing it to outperform conventional instruction-tuned models on benchmarks like BigBench Hard and AGIEval. In Orca 2, we continue exploring how improved training signals can enhance smaller LMs' reasoning abilities. Research on training small LMs has often relied on imitation learning to replicate the output of more capable models. We contend that excessive emphasis on imitation may restrict the potential of smaller models. We seek to teach small LMs to employ different solution strategies for different tasks, potentially different from the one used by the larger model. For example, while larger models might provide a direct answer to a complex task, smaller models may not have the same capacity. In Orca 2, we teach the model various reasoning techniques (step-by-step, recall then generate, recall-reason-generate, direct answer, etc.). More crucially, we aim to help the model learn to determine the most effective solution strategy for each task. We evaluate Orca 2 using a comprehensive set of 15 diverse benchmarks (corresponding to approximately 100 tasks and over 36,000 unique prompts). Orca 2 significantly surpasses models of similar size and attains performance levels similar to or better than those of models 5-10x larger, as assessed on complex tasks that test advanced reasoning abilities in zero-shot settings. We make Orca 2 weights publicly available at aka.ms/orca-lm to support research on the development, evaluation, and alignment of smaller LMs.
Submitted 21 November, 2023; v1 submitted 18 November, 2023;
originally announced November 2023.
-
Representing visual classification as a linear combination of words
Authors:
Shobhit Agarwal,
Yevgeniy R. Semenov,
William Lotter
Abstract:
Explainability is a longstanding challenge in deep learning, especially in high-stakes domains like healthcare. Common explainability methods highlight image regions that drive an AI model's decision. Humans, however, heavily rely on language to convey explanations of not only "where" but "what". Additionally, most explainability approaches focus on explaining individual AI predictions, rather than describing the features used by an AI model in general. The latter would be especially useful for model and dataset auditing, and potentially even knowledge generation as AI is increasingly being used in novel tasks. Here, we present an explainability strategy that uses a vision-language model to identify language-based descriptors of a visual classification task. By leveraging a pre-trained joint embedding space between images and text, our approach estimates a new classification task as a linear combination of words, resulting in a weight for each word that indicates its alignment with the vision-based classifier. We assess our approach using two medical imaging classification tasks, where we find that the resulting descriptors largely align with clinical knowledge despite a lack of domain-specific language training. However, our approach also identifies the potential for 'shortcut connections' in the public datasets used. Towards a functional measure of explainability, we perform a pilot reader study where we find that the AI-identified words can enable non-expert humans to perform a specialized medical task at a non-trivial level. Altogether, our results emphasize the potential of using multimodal foundational models to deliver intuitive, language-based explanations of visual tasks.
Submitted 17 November, 2023;
originally announced November 2023.
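The core idea in the entry above, expressing a classifier as a weighted sum of word embeddings in a shared image-text space, can be sketched as a small least-squares fit. This is only one plausible reading of the abstract: the function name `fit_word_weights`, the gradient-descent solver, the learning rate, and the toy four-dimensional embeddings and word labels are all assumptions for illustration, and the paper's actual fitting procedure may differ.

```python
def fit_word_weights(word_embeddings, classifier_vector, lr=0.1, steps=500):
    """Least-squares fit of classifier_vector ~ sum_i weights[i] * word_embeddings[i]."""
    weights = [0.0] * len(word_embeddings)
    for _ in range(steps):
        # Current reconstruction of the classifier vector from the words.
        recon = [
            sum(w * emb[d] for w, emb in zip(weights, word_embeddings))
            for d in range(len(classifier_vector))
        ]
        residual = [r - c for r, c in zip(recon, classifier_vector)]
        # Gradient of the squared error with respect to each word weight.
        for i, emb in enumerate(word_embeddings):
            grad = 2.0 * sum(res * e for res, e in zip(residual, emb))
            weights[i] -= lr * grad
    return weights


# Toy stand-ins for text embeddings of three hypothetical descriptor words.
words = ["opacity", "rib", "effusion"]
embeddings = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.6, 0.0, 0.8, 0.0],
]
# Toy classifier direction, built here as 2 * "opacity" - 1 * "effusion".
classifier = [1.4, 0.0, -0.8, 0.0]

weights = fit_word_weights(embeddings, classifier)
for word, weight in zip(words, weights):
    print(f"{word}: {weight:+.3f}")
```

A large positive or negative weight marks a word whose embedding aligns with, or opposes, the classifier direction, which is the sense in which each weight "indicates its alignment with the vision-based classifier".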
-
Control of the Purcell effect via unexcited atoms and exceptional points
Authors:
G. S. Agarwal
Abstract:
We examine the possible control of the celebrated Purcell effect in cavity quantum electrodynamics. We demonstrate that the presence of an unexcited atom can significantly alter the Purcell decay, depending on the strength of the coupling of the unexcited atom with the cavity mode, though the excited atom has to be weakly coupled for it to be in the Purcell regime. This is distinct from the nonradiative nature of the singlet state, which is an entangled state of the two-atom system. We present a physical interpretation of the inhibition as due to interference between two polariton channels of decay. We bring out the connection to exceptional points in the cavity QED system, as the unexcited atom and cavity mode can produce a second-order exceptional point. We further show how two unexcited atoms can create a third-order exceptional point, leading to inhibition of the Purcell effect. We also discuss the case when the Purcell effect can be enhanced.
Submitted 17 November, 2023;
originally announced November 2023.
-
A Decision Support System for Liver Diseases Prediction: Integrating Batch Processing, Rule-Based Event Detection and SPARQL Query
Authors:
Ritesh Chandra,
Sadhana Tiwari,
Satyam Rastogi,
Sonali Agarwal
Abstract:
Liver diseases pose a significant global health burden, affecting a large number of individuals and carrying substantial economic and social consequences. Liver problems are on the rise and are considered a fatal disease in many countries, such as Egypt and Moldova. The objective of this study is to construct a predictive model for liver illness using Basic Formal Ontology (BFO) and detection rules derived from a decision tree algorithm. Based on these rules, events are detected through batch processing using the Apache Jena framework. Based on the events detected, queries can be directly processed using SPARQL. To make the ontology operational, these Decision Tree (DT) rules are converted into Semantic Web Rule Language (SWRL). These SWRL rules are used in the ontology to predict different types of liver disease with the help of the Pellet and Drools inference engines in Protégé, using a total of 615 records covering different liver diseases. After inferring the rules, the result can be generated for the patient according to the DT rules, and other patient-related details along with different precautionary suggestions can be obtained based on these results. Combining query results of batch processing and ontology-generated results can give more accurate suggestions for disease prevention and detection. This work aims to provide a comprehensive approach that is applicable for liver disease prediction, rich knowledge graph representation, and smart querying capabilities. The results show that combining RDF data, SWRL rules, and SPARQL queries for analysing and predicting liver disease can help medical professionals to learn more about liver diseases and make a Decision Support System (DSS) for health care.
Submitted 10 November, 2023;
originally announced November 2023.
-
An Evaluation of Forensic Facial Recognition
Authors:
Justin Norman,
Shruti Agarwal,
Hany Farid
Abstract:
Recent advances in machine learning and computer vision have led to reported facial recognition accuracies surpassing human performance. We question if these systems will translate to real-world forensic scenarios in which a potentially low-resolution, low-quality, partially-occluded image is compared against a standard facial database. We describe the construction of a large-scale synthetic facial dataset along with a controlled facial forensic lineup, the combination of which allows for a controlled evaluation of facial recognition under a range of real-world conditions. Using this synthetic dataset, and a popular dataset of real faces, we evaluate the accuracy of two popular neural-based recognition systems. We find that previously reported face recognition accuracies of more than 95% drop to as low as 65% in this more challenging forensic scenario.
Submitted 10 November, 2023;
originally announced November 2023.
-
TokenMotion: Motion-Guided Vision Transformer for Video Camouflaged Object Detection Via Learnable Token Selection
Authors:
Zifan Yu,
Erfan Bank Tavakoli,
Meida Chen,
Suya You,
Raghuveer Rao,
Sanjeev Agarwal,
Fengbo Ren
Abstract:
The area of Video Camouflaged Object Detection (VCOD) presents unique challenges in the field of computer vision due to texture similarities between target objects and their surroundings, as well as irregular motion patterns caused by both objects and camera movement. In this paper, we introduce TokenMotion (TMNet), which employs a transformer-based model to enhance VCOD by extracting motion-guided features using a learnable token selection. Evaluated on the challenging MoCA-Mask dataset, TMNet achieves state-of-the-art performance in VCOD. It outperforms the existing state-of-the-art method by a 12.8% improvement in weighted F-measure, an 8.4% enhancement in S-measure, and a 10.7% boost in mean IoU. The results demonstrate the benefits of utilizing motion-guided features via learnable token selection within a transformer-based framework to tackle the intricate task of VCOD.
Submitted 1 February, 2024; v1 submitted 4 November, 2023;
originally announced November 2023.
-
Entanglement and coherence in pure and doped Posner molecules
Authors:
Betony Adams,
Ilya Sinayskiy,
Shivang Agarwal,
Francesco Petruccione
Abstract:
The potential role of spin in biological systems is a primary topic in quantum biology. However, much of this research focuses on electron spin. A recent hypothesis suggests that nuclear spin may be better suited to biological processes, being less sensitive to decoherence. The hypothesis details how phosphorus nuclei might be prepared in a spin-entangled state, how this entanglement is protected by assembly into calcium phosphate (Posner) molecules, and how this entanglement might modulate calcium ion production and concomitant neural activation. In this paper, we investigate the robustness of quantum effects such as coherence and entanglement in Posner molecules. We investigate how these effects are directly dependent on specific parameters such as spin-spin coupling strengths and Posner molecule symmetry. We also investigate how lithium isotope-doped Posner molecules differentially modulate quantum resources such as coherence and entanglement and whether this is a viable explanation for lithium's mechanism of action in bipolar disease. Finally, we illustrate how entanglement might possibly be preserved through the exploitation of the biological environment.
Submitted 20 October, 2023;
originally announced October 2023.
-
Domain-specific optimization and diverse evaluation of self-supervised models for histopathology
Authors:
Jeremy Lai,
Faruk Ahmed,
Supriya Vijay,
Tiam Jaroensri,
Jessica Loo,
Saurabh Vyawahare,
Saloni Agarwal,
Fayaz Jamil,
Yossi Matias,
Greg S. Corrado,
Dale R. Webster,
Jonathan Krause,
Yun Liu,
Po-Hsuan Cameron Chen,
Ellery Wulczyn,
David F. Steiner
Abstract:
Task-specific deep learning models in histopathology offer promising opportunities for improving diagnosis, clinical research, and precision medicine. However, development of such models is often limited by availability of high-quality data. Foundation models in histopathology that learn general representations across a wide range of tissue types, diagnoses, and magnifications offer the potential to reduce the data, compute, and technical expertise necessary to develop task-specific deep learning models with the required level of model performance. In this work, we describe the development and evaluation of foundation models for histopathology via self-supervised learning (SSL). We first establish a diverse set of benchmark tasks involving 17 unique tissue types and 12 unique cancer types and spanning different optimal magnifications and task types. Next, we use this benchmark to explore and evaluate histopathology-specific SSL methods followed by further evaluation on held out patch-level and weakly supervised tasks. We found that standard SSL methods thoughtfully applied to histopathology images are performant across our benchmark tasks and that domain-specific methodological improvements can further increase performance. Our findings reinforce the value of using domain-specific SSL methods in pathology, and establish a set of high quality foundation models to enable further research across diverse applications.
Submitted 19 October, 2023;
originally announced October 2023.
-
$f$-Policy Gradients: A General Framework for Goal Conditioned RL using $f$-Divergences
Authors:
Siddhant Agarwal,
Ishan Durugkar,
Peter Stone,
Amy Zhang
Abstract:
Goal-Conditioned Reinforcement Learning (RL) problems often have access to sparse rewards where the agent receives a reward signal only when it has achieved the goal, making policy optimization a difficult problem. Several works augment this sparse reward with a learned dense reward function, but this can lead to sub-optimal policies if the reward is misaligned. Moreover, recent works have demonstrated that effective shaping rewards for a particular problem can depend on the underlying learning algorithm. This paper introduces a novel way to encourage exploration called $f$-Policy Gradients, or $f$-PG. $f$-PG minimizes the $f$-divergence between the agent's state visitation distribution and the goal, which we show can lead to an optimal policy. We derive gradients for various $f$-divergences to optimize this objective. Our learning paradigm provides dense learning signals for exploration in sparse reward settings. We further introduce an entropy-regularized policy optimization objective, which we call $state$-MaxEnt RL (or $s$-MaxEnt RL), as a special case of our objective. We show that several metric-based shaping rewards like L2 can be used with $s$-MaxEnt RL, providing a common ground to study such metric-based shaping rewards with efficient exploration. We find that $f$-PG has better performance compared to standard policy gradient methods on a challenging gridworld as well as the Point Maze and FetchReach environments. More information is available on our website: https://agarwalsiddhant10.github.io/projects/fpg.html.
Submitted 10 October, 2023;
originally announced October 2023.
-
Quantum advantage of time-reversed ancilla-based metrology of absorption parameters
Authors:
Jiaxuan Wang,
Ruynet L. de Matos Filho,
Girish S. Agarwal,
Luiz Davidovich
Abstract:
Quantum estimation of parameters defining open-system dynamics may be enhanced by using ancillas that are entangled with the probe but are not submitted to the dynamics. Here we consider the important problem of estimation of transmission of light by a sample, with losses due to absorption and scattering. We show, through the determination of the quantum Fisher information, that the ancilla strategy leads to the best possible precision in single-mode estimation, the one obtained for a Fock state input, through joint photon-counting of probe and ancilla, which are modes of a bimodal squeezed state produced by an optical parametric amplifier. This proposal overcomes the challenge of producing and detecting high photon-number Fock states, and it is quite robust against additional noise: we show that it is immune to phase noise and the precision does not change if the incoming state gets disentangled. Furthermore, the quantum gain is still present under moderate photon losses of the input beams. We also discuss an alternative to joint photon counting, which is readily implementable with present technology, and approaches the quantum Fisher information result for weak absorption, even with moderate photon losses of the input beams before the sample is probed: a time-reversal procedure, placing the sample between two optical parametric amplifiers, with the second undoing the squeezing produced by the first one. The precision of estimation of the loss parameter is obtained from the average outgoing total photon number and its variance. In both procedures, the state of the probe and the detection procedure are independent of the value of the parameter.
Submitted 6 December, 2023; v1 submitted 9 October, 2023;
originally announced October 2023.
-
Outage-Watch: Early Prediction of Outages using Extreme Event Regularizer
Authors:
Shubham Agarwal,
Sarthak Chakraborty,
Shaddy Garg,
Sumit Bisht,
Chahat Jain,
Ashritha Gonuguntla,
Shiv Saini
Abstract:
Cloud services are omnipresent and critical cloud service failure is a fact of life. In order to retain customers and prevent revenue loss, it is important to provide high reliability guarantees for these services. One way to do this is by predicting outages in advance, which can help in reducing the severity as well as time to recovery. It is difficult to forecast critical failures due to the rarity of these events. Moreover, critical failures are ill-defined in terms of observable data. Our proposed method, Outage-Watch, defines critical service outages as deteriorations in the Quality of Service (QoS) captured by a set of metrics. Outage-Watch detects such outages in advance by using the current system state to predict whether the QoS metrics will cross a threshold and initiate an extreme event. A mixture of Gaussians is used to model the distribution of the QoS metrics for flexibility, and an extreme event regularizer helps in improving learning in the tail of the distribution. An outage is predicted if the probability of any one of the QoS metrics crossing the threshold changes significantly. Our evaluation on a real-world SaaS company dataset shows that Outage-Watch significantly outperforms traditional methods with an average AUC of 0.98. Additionally, Outage-Watch detects all the outages exhibiting a change in service metrics and reduces the Mean Time To Detection (MTTD) of outages by up to 88% when deployed in an enterprise cloud-service system, demonstrating the efficacy of our proposed method.
Submitted 10 November, 2023; v1 submitted 29 September, 2023;
originally announced September 2023.
-
Symbolic Regression on Sparse and Noisy Data with Gaussian Processes
Authors:
Junette Hsin,
Shubhankar Agarwal,
Adam Thorpe,
Luis Sentis,
David Fridovich-Keil
Abstract:
In this paper, we address the challenge of deriving dynamical models from sparse and noisy data. High-quality data is crucial for symbolic regression algorithms; limited and noisy data can present modeling challenges. To overcome this, we combine Gaussian process regression with a sparse identification of nonlinear dynamics (SINDy) method to denoise the data and identify nonlinear dynamical equations. Our simple approach offers improved robustness with sparse, noisy data compared to SINDy alone. We demonstrate its effectiveness on a Lotka-Volterra model, a unicycle dynamic model in simulation, and hardware data from an NVIDIA JetRacer system. We show superior performance over baselines including 20.78% improvement over SINDy and 61.92% improvement over SSR in predicting future trajectories from discovered dynamics.
Submitted 27 March, 2024; v1 submitted 20 September, 2023;
originally announced September 2023.
-
A Real-Time Approach for Smart Building Operations Prediction Using Rule-Based Complex Event Processing and SPARQL Query
Authors:
Shashi Shekhar Kumar,
Ritesh Chandra,
Sonali Agarwal
Abstract:
Due to their intelligent, adaptive nature towards various operations and their ability to provide maximum comfort to the occupants residing in them, smart buildings are becoming a pioneering area of research. Since these architectures leverage the Internet of Things (IoT), there is a need for monitoring different operations (Occupancy, Humidity, Temperature, CO2, etc.) to provide sustainable comfort to the occupants. This paper proposes a novel approach for intelligent building operations monitoring using rule-based complex event processing and query-based approaches for dynamically monitoring the different operations. Siddhi is a complex event processing engine designed for handling multiple sources of event data in real time and processing it according to predefined rules using a decision tree. Since streaming data is dynamic in nature, to keep track of different operations, we have converted the IoT data into an RDF dataset. The RDF dataset is ingested into Apache Kafka for streaming purposes, and for stored data we have used the GraphDB tool, which extracts information with the help of SPARQL queries. The proposed approach is also evaluated by passing a large number of events through the Siddhi CEP engine and measuring how efficiently they are processed in terms of time. Apart from that, a risk estimation scenario is also designed to generate alerts for end users in case any of the smart building operations need immediate attention. The output is visualized and monitored for the end user through a Tableau dashboard.
Submitted 28 August, 2023;
originally announced September 2023.
-
Asynchronous Perception-Action-Communication with Graph Neural Networks
Authors:
Saurav Agarwal,
Alejandro Ribeiro,
Vijay Kumar
Abstract:
Collaboration in large robot swarms to achieve a common global objective is a challenging problem in large environments due to limited sensing and communication capabilities. The robots must execute a Perception-Action-Communication (PAC) loop -- they perceive their local environment, communicate with other robots, and take actions in real time. A fundamental challenge in decentralized PAC systems is to decide what information to communicate with the neighboring robots and how to take actions while utilizing the information shared by the neighbors. Recently, this has been addressed using Graph Neural Networks (GNNs) for applications such as flocking and coverage control. Although conceptually, GNN policies are fully decentralized, the evaluation and deployment of such policies have primarily remained centralized or restrictively decentralized. Furthermore, existing frameworks assume sequential execution of perception and action inference, which is very restrictive in real-world applications. This paper proposes a framework for asynchronous PAC in robot swarms, where decentralized GNNs are used to compute navigation actions and generate messages for communication. In particular, we use aggregated GNNs, which enable the exchange of hidden layer information between robots for computational efficiency and decentralized inference of actions. Furthermore, the modules in the framework are asynchronous, allowing robots to perform sensing, extracting information, communication, action inference, and control execution at different frequencies. We demonstrate the effectiveness of GNNs executed in the proposed framework in navigating large robot swarms for collaborative coverage of large environments.
Submitted 18 September, 2023;
originally announced September 2023.
-
ESRO: Experience Assisted Service Reliability against Outages
Authors:
Sarthak Chakraborty,
Shubham Agarwal,
Shaddy Garg,
Abhimanyu Sethia,
Udit Narayan Pandey,
Videh Aggarwal,
Shiv Saini
Abstract:
Modern cloud services are prone to failures due to their complex architecture, making diagnosis a critical process. Site Reliability Engineers (SREs) spend hours leveraging multiple sources of data, including the alerts, error logs, and domain expertise through past experiences to locate the root cause(s). These experiences are documented as natural language text in outage reports for previous outages. However, utilizing the raw yet rich semi-structured information in the reports systematically is time-consuming. Structured information, on the other hand, such as alerts that are often used during fault diagnosis, is voluminous and requires expert knowledge to discern. Several strategies have been proposed to use each source of data separately for root cause analysis. In this work, we build a diagnostic service called ESRO that recommends root causes and remediation for failures by utilizing structured as well as semi-structured sources of data systematically. ESRO constructs a causal graph using alerts and a knowledge graph using outage reports, and merges them in a novel way to form a unified graph during training. A retrieval-based mechanism is then used to search the unified graph and rank the likely root causes and remediation techniques based on the alerts fired during an outage at inference time. Not only the individual alerts but also their respective importance in predicting an outage group is taken into account during recommendation. We evaluated our model on several cloud service outages of a large SaaS enterprise over the course of ~2 years, and obtained an average improvement of 27% in ROUGE scores over state-of-the-art baselines when comparing the likely root causes against the ground truth. We further establish the effectiveness of ESRO through qualitative analysis on multiple real outage examples.
Submitted 13 September, 2023;
originally announced September 2023.
-
3D Active Metric-Semantic SLAM
Authors:
Yuezhan Tao,
Xu Liu,
Igor Spasojevic,
Saurav Agarwal,
Vijay Kumar
Abstract:
In this letter, we address the problem of exploration and metric-semantic mapping of multi-floor GPS-denied indoor environments using Size Weight and Power (SWaP) constrained aerial robots. Most previous work in exploration assumes that robot localization is solved. However, neglecting the state uncertainty of the agent can ultimately lead to cascading errors both in the resulting map and in the state of the agent itself. Furthermore, actions that reduce localization errors may be at direct odds with the exploration task. We propose a framework that balances the efficiency of exploration with actions that reduce the state uncertainty of the agent. In particular, our algorithmic approach for active metric-semantic SLAM is built upon sparse information abstracted from raw problem data, to make it suitable for SWaP-constrained robots. Furthermore, we integrate this framework within a fully autonomous aerial robotic system that achieves autonomous exploration in cluttered, 3D environments. From extensive real-world experiments, we showed that by including Semantic Loop Closure (SLC), we can reduce the robot pose estimation errors by over 90% in translation and approximately 75% in yaw, and the uncertainties in pose estimates and semantic maps by over 70% and 65%, respectively. Although discussed in the context of indoor multi-floor exploration, our system can be used for various other applications, such as infrastructure inspection and precision agriculture where reliable GPS data may not be available.
Submitted 13 February, 2024; v1 submitted 13 September, 2023;
originally announced September 2023.
-
Single-photon induced instabilities in a cavity electromechanical device
Authors:
Tanmoy Bera,
Mridul Kandpal,
G. S. Agarwal,
Vibhor Singh
Abstract:
Cavity-electromechanical systems are extensively used for sensing and controlling the vibrations of mechanical resonators down to their quantum limit. The nonlinear radiation-pressure interaction in these systems could result in an unstable response of the mechanical resonator showing features such as frequency-combs, period-doubling bifurcations and chaos. However, due to weak light-matter interaction, typically these effects appear at very high driving strengths. By using polariton modes formed by a strongly coupled flux-tunable transmon and a microwave cavity, here we demonstrate an electromechanical device and achieve a single-photon coupling rate $g_0/2π$ of $160~$kHz, which is nearly 4\% of the mechanical frequency $ω_m$. Due to the large $g_0/ω_m$ ratio, the device shows an unstable mechanical response resulting in frequency combs in the sub-single-photon limit. We systematically investigate the boundary of the unstable response and identify two important regimes governed by the optomechanical backaction and the nonlinearity of the electromagnetic mode. Such an improvement in the single-photon coupling rate and the observations of microwave frequency combs at single-photon levels may have applications in the quantum control of the motional states and critical parametric sensing. Our experiments strongly suggest the requirement of newer approaches to understand instabilities.
Submitted 2 May, 2024; v1 submitted 13 September, 2023;
originally announced September 2023.
-
Topological transitions in dissipatively coupled Su-Schrieffer-Heeger models
Authors:
Jayakrishnan M. P. Nair,
Marlan O. Scully,
Girish S. Agarwal
Abstract:
Non-Hermitian topological phenomena have gained much interest among physicists in recent years. In this paper, we expound on the physics of dissipatively coupled Su-Schrieffer-Heeger (SSH) lattices, specifically in systems with bosonic and electrical constituents. In the context of electrical circuits, we demonstrate that a series of resistively coupled LCR circuits mimics the topology of a dissipatively coupled SSH model. In addition, we foreground a scheme to construct dissipatively coupled SSH lattices involving a set of non-interacting bosonic oscillators weakly coupled to engineered reservoirs of modes possessing substantially small lifetimes when compared to other system timescales. Further, by activating the coherent coupling between bosonic oscillators, we elucidate the emergence of non-reciprocal dissipative coupling which can be controlled by the phase of the coherent interaction strength precipitating in phase-dependent topological transitions and skin effect. Our analyses are generic, apropos of a large class of systems involving, for instance, optical and microwave settings, while the circuit implementation represents the most straightforward of them.
Submitted 11 September, 2023;
originally announced September 2023.
-
Multi-modal Extreme Classification
Authors:
Anshul Mittal,
Kunal Dahiya,
Shreya Malani,
Janani Ramaswamy,
Seba Kuruvilla,
Jitendra Ajmera,
Keng-hao Chang,
Sumeet Agarwal,
Purushottam Kar,
Manik Varma
Abstract:
This paper develops the MUFIN technique for extreme classification (XC) tasks with millions of labels where datapoints and labels are endowed with visual and textual descriptors. Applications of MUFIN to product-to-product recommendation and bid query prediction over several millions of products are presented. Contemporary multi-modal methods frequently rely on purely embedding-based methods. On the other hand, XC methods utilize classifier architectures to offer higher accuracies than embedding-only methods but mostly focus on text-based categorization tasks. MUFIN bridges this gap by reformulating multi-modal categorization as an XC problem with several millions of labels. This presents the twin challenges of developing multi-modal architectures that can offer embeddings sufficiently expressive to allow accurate categorization over millions of labels; and training and inference routines that scale logarithmically in the number of labels. MUFIN develops an architecture based on cross-modal attention and trains it in a modular fashion using pre-training and positive and negative mining. A novel product-to-product recommendation dataset MM-AmazonTitles-300K containing over 300K products was curated from publicly available amazon.com listings with each product endowed with a title and multiple images. On all datasets, MUFIN offered at least 3% higher accuracy than leading text-based, image-based and multi-modal techniques. Code for MUFIN is available at https://github.com/Extreme-classification/MUFIN
Submitted 10 September, 2023;
originally announced September 2023.
-
Electronic Structure Prediction of Multi-million Atom Systems Through Uncertainty Quantification Enabled Transfer Learning
Authors:
Shashank Pathrudkar,
Ponkrshnan Thiagarajan,
Shivang Agarwal,
Amartya S. Banerjee,
Susanta Ghosh
Abstract:
The ground state electron density -- obtainable using Kohn-Sham Density Functional Theory (KS-DFT) simulations -- contains a wealth of material information, making its prediction via machine learning (ML) models attractive. However, the computational expense of KS-DFT scales cubically with system size which tends to stymie training data generation, making it difficult to develop quantifiably accurate ML models that are applicable across many scales and system configurations. Here, we address this fundamental challenge by employing transfer learning to leverage the multi-scale nature of the training data, while comprehensively sampling system configurations using thermalization. Our ML models are less reliant on heuristics, and being based on Bayesian neural networks, enable uncertainty quantification. We show that our models incur significantly lower data generation costs while allowing confident -- and when verifiable, accurate -- predictions for a wide variety of bulk systems well beyond training, including systems with defects, different alloy compositions, and at unprecedented, multi-million-atom scales. Moreover, such predictions can be carried out using only modest computational resources.
Submitted 1 May, 2024; v1 submitted 24 August, 2023;
originally announced August 2023.
-
Incorporating Nonlocal Traffic Flow Model in Physics-informed Neural Networks
Authors:
Archie J. Huang,
Animesh Biswas,
Shaurya Agarwal
Abstract:
This research contributes to the advancement of traffic state estimation methods by leveraging the benefits of the nonlocal LWR model within a physics-informed deep learning framework. The classical LWR model, while useful, falls short of accurately representing real-world traffic flows. The nonlocal LWR model addresses this limitation by considering the speed as a weighted mean of the downstream traffic density. In this paper, we propose a novel PIDL framework that incorporates the nonlocal LWR model. We introduce both fixed-length and variable-length kernels and develop the required mathematics. The proposed PIDL framework undergoes a comprehensive evaluation, including various convolutional kernels and look-ahead windows, using data from the NGSIM and CitySim datasets. The results demonstrate improvements over the baseline PIDL approach using the local LWR model. The findings highlight the potential of the proposed approach to enhance the accuracy and reliability of traffic state estimation, enabling more effective traffic management strategies.
Submitted 22 August, 2023;
originally announced August 2023.
-
Explainable Machine Learning for Hydrogen Diffusion in Metals and Random Binary Alloys
Authors:
Grace M. Lu,
Matthew Witman,
Sapan Agarwal,
Vitalie Stavila,
Dallas R. Trinkle
Abstract:
Hydrogen diffusion in metals and alloys plays an important role in the discovery of new materials for fuel cell and energy storage technology. While analytic models use hand-selected features that have clear physical ties to hydrogen diffusion, they often lack accuracy when making quantitative predictions. Machine learning models are capable of making accurate predictions, but their inner workings are obscured, rendering it unclear which physical features are truly important. To develop interpretable machine learning models to predict the activation energies of hydrogen diffusion in metals and random binary alloys, we create a database for physical and chemical properties of the species and use it to fit six machine learning models. Our models achieve root-mean-squared-errors between 98-119 meV on the testing data and accurately predict that elemental Ru has a large activation energy, while elemental Cr and Fe have small activation energies. By analyzing the feature importances of these fitted models, we identify relevant physical properties for predicting hydrogen diffusivity. While metrics for measuring the individual feature importances for machine learning models exist, correlations between the features lead to disagreement between models and limit the conclusions that can be drawn. Instead grouped feature importances, formed by combining the features via their correlations, agree across the six models and reveal that the two groups containing the packing factor and electronic specific heat are particularly significant for predicting hydrogen diffusion in metals and random binary alloys. This framework allows us to interpret machine learning models and enables rapid screening of new materials with the desired rates of hydrogen diffusion.
Submitted 26 October, 2023; v1 submitted 15 August, 2023;
originally announced August 2023.
-
Reinforcement Learning (RL) Augmented Cold Start Frequency Reduction in Serverless Computing
Authors:
Siddharth Agarwal,
Maria A. Rodriguez,
Rajkumar Buyya
Abstract:
Function-as-a-Service is a cloud computing paradigm offering an event-driven execution model to applications. It features serverless attributes by eliminating resource management responsibilities from developers and offers transparent and on-demand scalability of applications. Typical serverless applications have stringent response time and scalability requirements and therefore rely on deployed services to provide quick and fault-tolerant feedback to clients. However, the FaaS paradigm suffers from cold starts as there is a non-negligible delay associated with on-demand function initialization. This work focuses on reducing the frequency of cold starts on the platform by using Reinforcement Learning. Our approach uses Q-learning and considers metrics such as function CPU utilization, existing function instances, and response failure rate to proactively initialize functions in advance based on the expected demand. The proposed solution was implemented on Kubeless and was evaluated using a normalised real-world function demand trace with matrix multiplication as the workload. The results demonstrate a favourable performance of the RL-based agent when compared to Kubeless' default policy and function keep-alive policy by improving throughput by up to 8.81% and reducing computation load and resource wastage by up to 55% and 37%, respectively, which is a direct outcome of reduced cold starts.
Submitted 14 August, 2023;
originally announced August 2023.
-
A Deep Recurrent-Reinforcement Learning Method for Intelligent AutoScaling of Serverless Functions
Authors:
Siddharth Agarwal,
Maria A. Rodriguez,
Rajkumar Buyya
Abstract:
Function-as-a-Service (FaaS) introduces a lightweight, function-based cloud execution model that finds its relevance in applications like IoT-edge data processing and anomaly detection. While CSPs offer a near-infinite function elasticity, these applications often experience fluctuating workloads and stricter performance constraints. A typical CSP strategy is to empirically determine and adjust desired function instances, "autoscaling", based on monitoring-based thresholds such as CPU or memory, to cope with demand and performance. However, threshold configuration either requires expert knowledge, historical data or a complete view of the environment, making autoscaling a performance bottleneck lacking an adaptable solution. RL algorithms are proven to be beneficial in analysing complex cloud environments and result in an adaptable policy that maximizes the expected objectives. Most realistic cloud environments usually involve operational interference and have limited visibility, making them partially observable. A general solution to tackle observability in highly dynamic settings is to integrate Recurrent units with model-free RL algorithms and model a decision process as a POMDP. Therefore, in this paper, we investigate a model-free Recurrent RL agent for function autoscaling and compare it against the model-free Proximal Policy Optimisation (PPO) algorithm. We explore the integration of an LSTM network with the state-of-the-art PPO algorithm to find that under our experimental and evaluation settings, recurrent policies were able to capture the environment parameters and show promising results for function autoscaling. We further compare a PPO-based autoscaling agent with commercially used threshold-based function autoscaling and posit that an LSTM-based autoscaling agent is able to improve throughput by 18%, function execution by 13% and account for 8.4% more function instances.
△ Less
Submitted 11 August, 2023;
originally announced August 2023.
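As an illustration of the threshold-based autoscaling that the abstract above uses as its baseline, the following is a minimal sketch of a utilisation-target scaling rule. The rule shape (scale the instance count by the ratio of observed to target utilisation, round up, and skip changes within a tolerance band) follows the formula documented for the Kubernetes Horizontal Pod Autoscaler; the function name, parameter names, and default values are hypothetical and are not taken from the paper.

```python
def desired_instances(current, observed_pct, target_pct,
                      min_instances=1, max_instances=100, tolerance_pct=10):
    """Return the instance count a utilisation-threshold autoscaler would request.

    Utilisation values are integer percentages so the arithmetic stays exact.
    All names and defaults here are illustrative assumptions.
    """
    if target_pct <= 0:
        raise ValueError("target_pct must be positive")
    # Inside the tolerance band around the target: leave the count unchanged.
    if abs(observed_pct - target_pct) * 100 <= tolerance_pct * target_pct:
        return current
    # Ceiling of current * observed / target, using integer arithmetic.
    wanted = -(-(current * observed_pct) // target_pct)
    # Clamp to the configured bounds.
    return max(min_instances, min(max_instances, wanted))


if __name__ == "__main__":
    print(desired_instances(4, 90, 60))   # overloaded: scale out
    print(desired_instances(4, 15, 60))   # underused: scale in
```

A rule of this kind needs a hand-picked target and tolerance, which is the configuration burden the abstract points to as motivation for a learned policy.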
-
Density-contrast induced inertial forces on particles in oscillatory flows
Authors:
Siddhansh Agarwal,
Gaurav Upadhyay,
Yashraj Bhosale,
Mattia Gazzola,
Sascha Hilgenfeldt
Abstract:
Oscillatory flows have become an indispensable tool in microfluidics, inducing inertial effects for displacing and manipulating fluid-borne objects in a reliable, controllable, and label-free fashion. However, the quantitative description of such effects has been confined to limit cases and specialized scenarios. Here we develop an analytical formalism yielding the equation of motion of density-mismatched spherical particles in arbitrary background flows, generalizing previous work. Inertial force terms are systematically derived from the geometry of the flow field together with analytically known Stokes number dependences. Supported by independent, first-principles direct numerical simulations, we find that these forces are important even for nearly density-matched objects such as cells or bacteria, enabling their fast displacement and separation. Our formalism thus generalizes the Maxey--Riley equation, encompassing not only particle inertia, but consistently recovering, in the limit of large Stokes numbers, the Auton modification to added mass as well as the far-field acoustofluidic secondary radiation force.
Submitted 8 August, 2023;
originally announced August 2023.
-
$THH$ of the Morava $E$-theory Spectrum $E_{2}$
Authors:
Sanjana Agarwal
Abstract:
The Morava $E$-theories, $E_{n}$, are complex-oriented $2$-periodic ring spectra, with homotopy groups $\mathbb{W}_{\mathbb{F}_{p^{n}}}[[u_{1}, u_{2}, ... , u_{n-1}]][u,u^{-1}]$. Here $\mathbb{W}$ denotes the Witt vector ring. $E_{n}$ is a Landweber exact spectrum and hence uniquely determined by this ring as a $BP_{\ast}$-algebra. The algebraic $K$-theory of $E_{n}$ is a key ingredient towards analyzing the layers in the $p$-complete Waldhausen $K$-theory chromatic tower. One hopes to use the machinery of trace methods to get results towards $K$-theory once the computation for $THH(E_{n})$ is known.
In this paper we describe $THH(E_{2})$ as part of a consecutive chain of cofiber sequences where each cofiber sits in the next cofiber sequence and the first term of each cofiber sequence is describable completely in terms of suspensions and localizations of $E_{2}$. For these results, we first calculate the $K(i)$-homology of $THH(E_{2})$ using a Bökstedt spectral sequence and then lift the generating classes of $K(1)$-homology to fundamental classes in the homotopy groups of $THH(E_{2})$. These lifts allow us to construct the terms of the cofiber sequences and explicitly understand how they map to $THH(E_{2})$.
Submitted 27 July, 2023;
originally announced July 2023.
-
Magnetohydrodynamics simulation of magnetic flux rope formation in a quadrupolar magnetic field configuration
Authors:
Sanjay Kumar,
Avijeet Prasad,
Sushree S. Nayak,
Satyam Agarwal,
R. Bhattacharyya
Abstract:
Magnetic flux ropes (MFRs) play an important role in high-energetic events like solar flares and coronal mass ejections in the solar atmosphere. Importantly, solar observations suggest an association of some flaring events with quadrupolar magnetic configurations. However, the formation and subsequent evolution of MFRs in such magnetic configurations still need to be fully understood. In this paper, we present idealized magnetohydrodynamics (MHD) simulations of MFR formation in a quadrupolar magnetic configuration. A suitable initial magnetic field having a quadrupolar configuration is constructed by modifying a three-dimensional (3D) linear force-free magnetic field. The initial magnetic field contains neutral lines, which consist of X-type null points. The simulated dynamics initially demonstrate the oppositely directed magnetic field lines located across the polarity inversion lines (PILs) moving towards each other, resulting in magnetic reconnections. Due to these reconnections, four highly twisted MFRs form over the PILs. With time, the foot points of the MFRs move towards the X-type neutral lines and reconnect, generating complex magnetic structures around the neutral lines, thus making the MFR topology more complex in the quadrupolar configuration than those formed in bipolar loop systems. Further evolution reveals the non-uniform rise of the MFRs. Importantly, the simulations indicate that the pre-existing X-type null points in magnetic configurations can be crucial to the evolution of the MFRs and may lead to the observed brightenings during the onset of some flaring events in the quadrupolar configurations.
Submitted 12 July, 2023;
originally announced July 2023.
-
Engineering bound states in continuum via nonlinearity induced extra dimension
Authors:
Qingtian Miao,
Jayakrishnan M. P. Nair,
Girish S. Agarwal
Abstract:
Bound states in continuum (BICs) are localized states of a system possessing significantly large lifetimes, with applications across various branches of science. In this work, we propose an expedient protocol to engineer BICs which involves the use of Kerr nonlinearities in the system. The generation of BICs is a direct artifact of the nonlinearity and the associated expansion in the dimensionality of the system. In particular, we consider single- and two-mode anharmonic systems and provide a number of solutions apposite for the creation of BICs. In close vicinity to the BIC, the steady-state response of the system is immensely sensitive to perturbations in the natural frequencies of the system, and we illustrate its propitious sensing potential in the context of experimentally realizable setups for both optical and magnetic nonlinearities.
Submitted 10 July, 2023;
originally announced July 2023.
-
Search-time Efficient Device Constraints-Aware Neural Architecture Search
Authors:
Oshin Dutta,
Tanu Kanvar,
Sumeet Agarwal
Abstract:
Edge computing aims to enable edge devices, such as IoT devices, to process data locally instead of relying on the cloud. However, deep learning techniques like computer vision and natural language processing can be computationally expensive and memory-intensive. Creating manual architectures specialized for each device is infeasible due to their varying memory and computational constraints. To address these concerns, we automate the construction of task-specific deep learning architectures optimized for device constraints through Neural Architecture Search (NAS). We present DCA-NAS, a principled method of fast neural network architecture search that incorporates edge-device constraints such as model size and floating-point operations. It incorporates weight sharing and channel bottleneck techniques to speed up the search time. Based on our experiments, we see that DCA-NAS outperforms manual architectures for similar-sized models and is comparable to popular mobile architectures on various image classification datasets like CIFAR-10, CIFAR-100, and ImageNet-1k. Experiments with the DARTS and NAS-Bench-201 search spaces show the generalization capabilities of DCA-NAS. On further evaluating our approach on Hardware-NAS-Bench, device-specific architectures with low inference latency and state-of-the-art performance were discovered.
Submitted 10 July, 2023;
originally announced July 2023.
-
SkipDecode: Autoregressive Skip Decoding with Batching and Caching for Efficient LLM Inference
Authors:
Luciano Del Corro,
Allie Del Giorno,
Sahaj Agarwal,
Bin Yu,
Ahmed Awadallah,
Subhabrata Mukherjee
Abstract:
Autoregressive large language models (LLMs) have made remarkable progress in various natural language generation tasks. However, they incur high computation cost and latency resulting from the autoregressive token-by-token generation. To address this issue, several approaches have been proposed to reduce computational cost using early-exit strategies. These strategies enable faster text generation using reduced computation without applying the full computation graph to each token. While existing token-level early exit methods show promising results for online inference, they cannot be readily applied for batch inferencing and Key-Value caching. This is because they have to wait until the last token in a batch exits before they can stop computing. This severely limits the practical application of such techniques. In this paper, we propose a simple and effective token-level early exit method, SkipDecode, designed to work seamlessly with batch inferencing and KV caching. It overcomes prior constraints by setting up a singular exit point for every token in a batch at each sequence position. It also guarantees a monotonic decrease in exit points, thereby eliminating the need to recompute KV Caches for preceding tokens. Rather than terminating computation prematurely as in prior works, our approach bypasses lower to middle layers, devoting most of the computational resources to upper layers, allowing later tokens to benefit from the compute expenditure by earlier tokens. Our experimental results show that SkipDecode can obtain 2x to 5x inference speedups with negligible regression across a variety of tasks. This is achieved using OPT models of 1.3 billion and 6.7 billion parameters, all the while being directly compatible with batching and KV caching optimization techniques.
Submitted 5 July, 2023;
originally announced July 2023.
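The two properties stated in the SkipDecode abstract above (one exit point shared by the whole batch at each sequence position, and exit points that never increase with position) can be sketched as a simple schedule. The linear interpolation between a maximum and a minimum exit layer, along with every name and number below, is an assumption made for illustration; the abstract does not specify the exact schedule.

```python
def exit_layer_schedule(seq_len, max_exit, min_exit):
    """Assign one exit layer per sequence position, shared across the batch.

    The schedule is non-increasing in position, so a later token never needs
    more layers than an earlier one. The linear decay is an illustrative
    assumption, not necessarily the schedule used in the paper.
    """
    if seq_len < 1:
        raise ValueError("seq_len must be at least 1")
    if min_exit > max_exit:
        raise ValueError("min_exit must not exceed max_exit")
    if seq_len == 1:
        return [max_exit]
    span = max_exit - min_exit
    # Integer interpolation from max_exit (position 0) to min_exit (last position).
    return [max_exit - (span * t) // (seq_len - 1) for t in range(seq_len)]


if __name__ == "__main__":
    schedule = exit_layer_schedule(seq_len=5, max_exit=24, min_exit=12)
    print(schedule)
```

Because the exit point depends only on the position and not on the individual token, every sequence in a batch stops at the same layer for a given step, which is what makes the approach compatible with batching. Integer rounding can produce equal neighbouring values, so the sketch guarantees a non-increasing rather than strictly decreasing schedule.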
-
Orca: Progressive Learning from Complex Explanation Traces of GPT-4
Authors:
Subhabrata Mukherjee,
Arindam Mitra,
Ganesh Jawahar,
Sahaj Agarwal,
Hamid Palangi,
Ahmed Awadallah
Abstract:
Recent research has focused on enhancing the capability of smaller models through imitation learning, drawing on the outputs generated by large foundation models (LFMs). A number of issues impact the quality of these models, ranging from limited imitation signals from shallow LFM outputs; small scale homogeneous training data; and most notably a lack of rigorous evaluation resulting in overestimating the small model's capability as they tend to learn to imitate the style, but not the reasoning process of LFMs. To address these challenges, we develop Orca (We are working with our legal team to publicly release a diff of the model weights in accordance with LLaMA's release policy to be published at https://aka.ms/orca-lm), a 13-billion parameter model that learns to imitate the reasoning process of LFMs. Orca learns from rich signals from GPT-4 including explanation traces; step-by-step thought processes; and other complex instructions, guided by teacher assistance from ChatGPT. To promote this progressive learning, we tap into large-scale and diverse imitation data with judicious sampling and selection. Orca surpasses conventional state-of-the-art instruction-tuned models such as Vicuna-13B by more than 100% in complex zero-shot reasoning benchmarks like Big-Bench Hard (BBH) and 42% on AGIEval. Moreover, Orca reaches parity with ChatGPT on the BBH benchmark and shows competitive performance (4 pts gap with optimized system message) in professional and academic examinations like the SAT, LSAT, GRE, and GMAT, both in zero-shot settings without CoT; while trailing behind GPT-4. Our research indicates that learning from step-by-step explanations, whether these are generated by humans or more advanced AI models, is a promising direction to improve model capabilities and skills.
Submitted 5 June, 2023;
originally announced June 2023.
-
Nonreciprocal heat flux via synthetic fields in linear quantum systems
Authors:
S. -A. Biehs,
P. Rodriguez-Lopez,
M. Antezza,
G. S. Agarwal
Abstract:
We study the heat transfer between $N$ coupled quantum resonators with applied synthetic electric and magnetic fields realized by changing the resonators' parameters through external driving. To this end we develop two general methods, based on the quantum optical master equation and on the Langevin equation for $N$ coupled oscillators, where all quantum oscillators can have their own heat baths. The synthetic electric and magnetic fields are generated by a dynamical modulation of the oscillator resonance with a given phase. Using Floquet theory, we solve the dynamical equations with both methods, which allows us to determine the heat flux spectra and the transferred power. We apply these methods to study the specific case of a linear tight-binding chain of four coupled quantum resonators. We find that in that case, in addition to a non-reciprocal heat flux spectrum already predicted in previous investigations, the synthetic fields induce non-reciprocity in the total heat flux, hence realizing a net heat flux rectification.
Submitted 12 June, 2023; v1 submitted 29 May, 2023;
originally announced May 2023.
-
Text-Augmented Open Knowledge Graph Completion via Pre-Trained Language Models
Authors:
Pengcheng Jiang,
Shivam Agarwal,
Bowen **,
Xuan Wang,
Jimeng Sun,
Jiawei Han
Abstract:
The mission of open knowledge graph (KG) completion is to draw new findings from known facts. Existing works that augment KG completion require either (1) factual triples to enlarge the graph reasoning space or (2) manually designed prompts to extract knowledge from a pre-trained language model (PLM), exhibiting limited performance and requiring expensive efforts from experts. To this end, we propose TAGREAL that automatically generates quality query prompts and retrieves support information from large text corpora to probe knowledge from PLM for KG completion. The results show that TAGREAL achieves state-of-the-art performance on two benchmark datasets. We find that TAGREAL has superb performance even with limited training data, outperforming existing embedding-based, graph-based, and PLM-based methods.
Submitted 24 May, 2023;
originally announced May 2023.
-
Enhancement of synthetic magnetic field induced nonreciprocity via bound states in continuum in dissipatively coupled systems
Authors:
S. -A. Biehs,
G. S. Agarwal
Abstract:
The nonreciprocal propagation of light typically requires the use of materials like ferrites or magneto-optical media with a strong magnetic bias, or methods based on material nonlinearities which require the use of strong electromagnetic fields. A simpler possibility to produce nonreciprocity is to use spatio-temporal modulations to produce magnetic fields in synthetic dimensions. In this paper we show that dissipatively coupled systems can lead to considerable enhancement of nonreciprocity in synthetic fields. The enhancement comes about from the existence of a nearly nondecaying mode, a bound state in continuum (BIC), in dissipatively coupled systems. The dissipative coupling occurs in a wide class of systems coupled via transmission lines, waveguides, or nanofibers. The systems could be optical resonators or microscopic qubits. Remarkably, we find that for a specific choice of the modulation amplitudes, the transmission in, say, the forward direction is completely extinguished, whereas in the backward direction it becomes maximum. The synthetic fields produce transmission resonances which show significant line narrowing, which owes its origin to the existence of BICs in dissipative systems.
Submitted 24 May, 2023;
originally announced May 2023.
-
Deterministic Algorithmic Approaches to Solve Generalised Wordle
Authors:
Aditya Lahiri,
Naigam Shah,
Shivaank Agarwal,
Vignesh Nandakumar
Abstract:
Wordle is a single-player word-based game where the objective is to guess the 5-letter word in a maximum of 6 tries. The game was released to the public in October 2021 and has since gained popularity, with people competing against each other to maintain daily streaks and guess the word in a minimum number of tries. There have been works using probabilistic and reinforcement learning based approaches to solve the game. Our work aims to formulate and analyze deterministic algorithms that can solve the game and minimize the number of turns required to guess the word, and to do so for any generalized setting of the game. As a simplifying assumption, for our analysis of all the algorithms we present, we assume that all letters are unique in any word that is part of our vocabulary. We propose two algorithms to play Wordle: one a greedy approach, and the other based on cliques. The greedy approach is applicable to both the hard and easy modes of Wordle, while the clique-formation approach only works in the easy mode. We then present our analysis of both approaches in turn.
Submitted 24 May, 2023;
originally announced May 2023.
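Any deterministic Wordle solver of the kind the abstract above describes has to score a guess against the secret and prune the vocabulary accordingly. The sketch below shows that scoring and pruning step under the abstract's simplifying assumption that no word repeats a letter. The feedback encoding (`G`, `Y`, `-`) and the function names are hypothetical, and this is not the greedy or clique-based algorithm from the paper.

```python
def feedback(guess, secret):
    """Score a guess: 'G' right letter and place, 'Y' letter elsewhere, '-' absent.

    Assumes guess and secret have equal length and that no word repeats a
    letter (the simplifying assumption made in the abstract), so no
    duplicate-letter bookkeeping is needed.
    """
    marks = []
    for i, ch in enumerate(guess):
        if secret[i] == ch:
            marks.append("G")
        elif ch in secret:
            marks.append("Y")
        else:
            marks.append("-")
    return "".join(marks)


def filter_candidates(candidates, guess, pattern):
    """Keep only the words that would have produced `pattern` for `guess`."""
    return [word for word in candidates if feedback(guess, word) == pattern]


if __name__ == "__main__":
    vocabulary = ["react", "trace", "cider"]
    pattern = feedback("crane", "react")
    print(pattern, filter_candidates(vocabulary, "crane", pattern))
```

A solver would repeat this loop, choosing the next guess from the surviving candidates (or, in easy mode, from the whole vocabulary) by whatever deterministic rule it uses.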
-
Hall effect on the magnetic reconnections during the evolution of a three-dimensional magnetic flux rope
Authors:
Kamlesh Bora,
Satyam Agarwal,
Sanjay Kumar,
Ramit Bhattacharyya
Abstract:
We present a novel Hall magnetohydrodynamics (HMHD) numerical simulation of a three-dimensional (3D) magnetic flux rope (MFR) generated by magnetic reconnections from an initial 3D bipolar sheared field. Magnetic reconnections during the HMHD evolution are compared with those in the MHD evolution. In both simulations, the MFRs are generated as a consequence of magnetic reconnection at null points, which has not been realized in contemporary simulations. Interestingly, the evolution is faster and more intricate in the HMHD simulation. Repetitive development of the twisted magnetic field lines (MFLs) in the vicinity of 3D nulls (the reconnection sites) is unique to the HMHD evolution of the MFR. The dynamical evolution of magnetic field lines around the reconnection site, being affected by the Hall forcing, correspondingly affects the large-scale structures.
Submitted 19 May, 2023;
originally announced May 2023.
-
Massively Multi-Lingual Event Understanding: Extraction, Visualization, and Search
Authors:
Chris Jenkins,
Shantanu Agarwal,
Joel Barry,
Steven Fincke,
Elizabeth Boschee
Abstract:
In this paper, we present ISI-Clear, a state-of-the-art, cross-lingual, zero-shot event extraction system and accompanying user interface for event visualization & search. Using only English training data, ISI-Clear makes global events available on-demand, processing user-supplied text in 100 languages ranging from Afrikaans to Yiddish. We provide multiple event-centric views of extracted events, including both a graphical representation and a document-level summary. We also integrate existing cross-lingual search algorithms with event extraction capabilities to provide cross-lingual event-centric search, allowing English-speaking users to search over events automatically extracted from a corpus of non-English documents, using either English natural language queries (e.g. cholera outbreaks in Iran) or structured queries (e.g. find all events of type Disease-Outbreak with agent cholera and location Iran).
Submitted 17 May, 2023;
originally announced May 2023.
-
Cuttlefish: Low-Rank Model Training without All the Tuning
Authors:
Hongyi Wang,
Saurabh Agarwal,
Pongsakorn U-chupala,
Yoshiki Tanaka,
Eric P. Xing,
Dimitris Papailiopoulos
Abstract:
Recent research has shown that training low-rank neural networks can effectively reduce the total number of trainable parameters without sacrificing predictive accuracy, resulting in end-to-end speedups. However, low-rank model training necessitates adjusting several additional factorization hyperparameters, such as the rank of the factorization at each layer. In this paper, we tackle this challenge by introducing Cuttlefish, an automated low-rank training approach that eliminates the need for tuning factorization hyperparameters. Cuttlefish leverages the observation that after a few epochs of full-rank training, the stable rank (i.e., an approximation of the true rank) of each layer stabilizes at a constant value. Cuttlefish switches from full-rank to low-rank training once the stable ranks of all layers have converged, setting the dimension of each factorization to its corresponding stable rank. Our results show that Cuttlefish generates models up to 5.6 times smaller than full-rank models, and attains up to a 1.2 times faster end-to-end training process while preserving comparable accuracy. Moreover, Cuttlefish outperforms state-of-the-art low-rank model training methods and other prominent baselines. The source code for our implementation can be found at: https://github.com/hwang595/Cuttlefish.
Submitted 5 May, 2023; v1 submitted 4 May, 2023;
originally announced May 2023.
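The Cuttlefish abstract above relies on the stable rank of each layer's weight matrix. A minimal sketch of that quantity follows, using the standard definition: the squared Frobenius norm divided by the squared largest singular value. Whether the paper applies any additional scaling is not stated in the abstract. The pure-Python power iteration, the function names, and the iteration count are illustrative assumptions chosen to keep the example dependency-free.

```python
import math
import random


def spectral_norm_sq(w, iters=200, seed=0):
    """Estimate the largest squared singular value of w via power iteration on w^T w."""
    rows, cols = len(w), len(w[0])
    rng = random.Random(seed)
    # A random start vector is almost surely not orthogonal to the top singular vector.
    v = [rng.gauss(0.0, 1.0) for _ in range(cols)]
    estimate = 0.0
    for _ in range(iters):
        u = [sum(w[i][j] * v[j] for j in range(cols)) for i in range(rows)]  # u = w v
        z = [sum(w[i][j] * u[i] for i in range(rows)) for j in range(cols)]  # z = w^T u
        norm = math.sqrt(sum(x * x for x in z))
        if norm == 0.0:
            return 0.0  # w maps v to zero; happens for the zero matrix
        # Rayleigh quotient |w v|^2 / |v|^2 approaches the top eigenvalue of w^T w.
        estimate = sum(x * x for x in u) / sum(x * x for x in v)
        v = [x / norm for x in z]
    return estimate


def stable_rank(w):
    """Stable rank: squared Frobenius norm over squared spectral norm."""
    frobenius_sq = sum(x * x for row in w for x in row)
    top_sq = spectral_norm_sq(w)
    return frobenius_sq / top_sq if top_sq > 0.0 else 0.0


if __name__ == "__main__":
    print(stable_rank([[1.0, 0.0], [0.0, 1.0]]))  # identity matrix
    print(stable_rank([[1.0, 2.0], [2.0, 4.0]]))  # rank-one matrix
```

The stable rank never exceeds the true rank and changes smoothly with the weights, which makes it a convenient quantity to track during training; a criterion such as "switch to low-rank training once these values stop changing" could be built on top of it.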
-
EKILA: Synthetic Media Provenance and Attribution for Generative Art
Authors:
Kar Balan,
Shruti Agarwal,
Simon Jenni,
Andy Parsons,
Andrew Gilbert,
John Collomosse
Abstract:
We present EKILA, a decentralized framework that enables creatives to receive recognition and reward for their contributions to generative AI (GenAI). EKILA proposes a robust visual attribution technique and combines this with an emerging content provenance standard (C2PA) to address the problem of synthetic image provenance -- determining the generative model and training data responsible for an AI-generated image. Furthermore, EKILA extends the non-fungible token (NFT) ecosystem to introduce a tokenized representation for rights, enabling a triangular relationship between the asset's Ownership, Rights, and Attribution (ORA). Leveraging the ORA relationship enables creators to express agency over training consent and, through our attribution model, to receive apportioned credit, including royalty payments for the use of their assets in GenAI.
Submitted 10 April, 2023;
originally announced April 2023.
-
RoSteALS: Robust Steganography using Autoencoder Latent Space
Authors:
Tu Bui,
Shruti Agarwal,
Ning Yu,
John Collomosse
Abstract:
Data hiding such as steganography and invisible watermarking has important applications in copyright protection, privacy-preserved communication and content provenance. Existing works often fall short in either preserving image quality, or robustness against perturbations or are too complex to train. We propose RoSteALS, a practical steganography technique leveraging frozen pretrained autoencoders to free the payload embedding from learning the distribution of cover images. RoSteALS has a light-weight secret encoder of just 300k parameters, is easy to train, has perfect secret recovery performance and comparable image quality on three benchmarks. Additionally, RoSteALS can be adapted for novel cover-less steganography applications in which the cover image can be sampled from noise or conditioned on text prompts via a denoising diffusion process. Our model and code are available at https://github.com/TuBui/RoSteALS.
Submitted 6 April, 2023;
originally announced April 2023.
-
Polaritonic Ultrastrong Coupling: Quantum Entanglement in Ground State
Authors:
Qingtian Miao,
G. S. Agarwal
Abstract:
The ultrastrong coupling between the elementary excitations of matter and microcavity modes is studied in a fully analytical quantum-mechanical theoretical framework. The elementary excitation could be phonons, excitons, plasmons, etc. From the diagonalization of the Hamiltonian, we obtain the ground state of the polariton Hamiltonian. The ground state belongs to the Gaussian class. Using the Gaussian property we calculate the quantum entanglement in the ground state. We use two different measures for quantum entanglement -- entanglement entropy and the logarithmic negativity parameter and obtain rather simple analytical expressions for the entanglement measures. Our findings show that the amount of quantum entanglement in the ground state is quite significant in the ultrastrong coupling regime. It can be obtained from the measurement of the polariton frequencies.
Submitted 2 April, 2023;
originally announced April 2023.
-
Demystifying CXL Memory with Genuine CXL-Ready Systems and Devices
Authors:
Yan Sun,
Yifan Yuan,
Zeduo Yu,
Reese Kuper,
Chihun Song,
**ghan Huang,
Houxiang Ji,
Siddharth Agarwal,
Jiaqi Lou,
Ipoom Jeong,
Ren Wang,
Jung Ho Ahn,
Tianyin Xu,
Nam Sung Kim
Abstract:
The ever-growing demands for memory with larger capacity and higher bandwidth have driven recent innovations in memory expansion and disaggregation technologies based on Compute Express Link (CXL). In particular, CXL-based memory expansion technology has recently gained notable attention for its ability not only to economically expand memory capacity and bandwidth but also to decouple memory technologies from a specific memory interface of the CPU. However, since CXL memory devices have not been widely available, they have been emulated using DDR memory in a remote NUMA node. In this paper, for the first time, we comprehensively evaluate a true CXL-ready system based on the latest 4th-generation Intel Xeon CPU with three CXL memory devices from different manufacturers. Specifically, we run a set of microbenchmarks not only to compare the performance of true CXL memory with that of emulated CXL memory but also to analyze the complex interplay between the CPU and CXL memory in depth. This reveals important differences between emulated CXL memory and true CXL memory, some of which will compel researchers to revisit the analyses and proposals from recent work. Next, we identify opportunities for memory-bandwidth-intensive applications to benefit from the use of CXL memory. Lastly, we propose a CXL-memory-aware dynamic page allocation policy, Caption, to more efficiently use CXL memory as a bandwidth expander. We demonstrate that Caption can automatically converge to an empirically favorable percentage of pages allocated to CXL memory, which improves the performance of memory-bandwidth-intensive applications by up to 24% when compared to the default page allocation policy designed for traditional NUMA systems.
Submitted 4 October, 2023; v1 submitted 27 March, 2023;
originally announced March 2023.