Search | arXiv e-print repository

A Comprehensive Convolutional Neural Network Architecture Design using Magnetic Skyrmion and Domain Wall

Authors: Saumya Gupta, Venkatesh Vadde, Bhaskaran Muralidharan, Abhishek Sharma

Abstract: Spintronic-based neuromorphic hardware enables high-density and rapid data processing at nanoscale lengths, leveraging topologically protected spin configurations and the low current densities needed to manipulate magnetic structures such as skyrmions and domain walls. The paper presents a compact, energy-efficient multi-bit skyrmionic synapse and domain wall-based ReLU with max-pooling functionalities for hardware neural network applications. 4-bit, 5-bit, and 6-bit skyrmionic synapses are proposed, featuring a circular bilayer vortex-based geometry. The 4-bit skyrmionic synapse consumes an ultra-low energy of 0.8724 fJ per weight update. The proposed skyrmionic synapse comprises an ultra-thin ferromagnetic layer with a strong Dzyaloshinskii-Moriya interaction and a polarizer layer with a vortex-like spin configuration. The interaction between perpendicular current flow and the labyrinth maze-like uniaxial anisotropy profiles induces skyrmionic gyration, resulting in long-term potentiation (LTP) and long-term depression (LTD) that modify the synaptic weights. We develop a phenomenology of the synaptic device, implementing 16-state (4-bit), 32-state (5-bit), and 64-state (6-bit) skyrmionic synapses, and analyze them quantitatively using micromagnetic simulations. Furthermore, we design a CMOS hybrid domain wall-based ReLU-max pooled circuit. The activation function relies on the variation of the domain wall position, and hence of the device resistance, as the wall encounters uniaxial anisotropy variation along the track. To demonstrate the practical application of our 4-bit (16-state) skyrmionic synapse with the domain wall-based ReLU-max pooling circuit, we integrate it into an inference-based convolutional neural network (CNN) for pattern recognition, achieving an accuracy of 98.07%, comparable to software-based ideal training.

Submitted 11 July, 2024; originally announced July 2024.

Comments: 15 pages, 10 figures
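
The 16-, 32-, and 64-state synapses described in the abstract above correspond to 4-, 5-, and 6-bit weight resolution. The following is a minimal sketch of how a continuous weight could be mapped to one of those discrete states. It assumes uniformly spaced levels over a hypothetical range [-1, 1]; the function name `quantize_weight` and the range are illustrative choices, and the physical device states need not be uniformly spaced.

```python
def quantize_weight(w, bits=4, w_min=-1.0, w_max=1.0):
    """Map a continuous weight to the nearest of 2**bits uniformly spaced states.

    Assumption: uniform spacing over [w_min, w_max]; this is an idealization,
    not the device characteristic reported in the paper.
    Returns (state_index, quantized_value).
    """
    levels = 2 ** bits                        # 16, 32 or 64 states
    step = (w_max - w_min) / (levels - 1)     # spacing between adjacent states
    clipped = min(max(w, w_min), w_max)       # saturate out-of-range weights
    index = round((clipped - w_min) / step)   # nearest state index
    return index, w_min + index * step


if __name__ == "__main__":
    for bits in (4, 5, 6):
        index, value = quantize_weight(0.3, bits=bits)
        print(f"{bits}-bit: state {index} of {2 ** bits}, value {value:.4f}")
```

With more bits the spacing between states shrinks, so the rounding error on any single weight is bounded by half a step.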

arXiv:2406.19511 [pdf, ps, other]

Cohomology of Fuchsian groups and Fourier interpolation

Authors: Mathilde Gerbelli-Gauthier, Akshay Venkatesh

Abstract: We give a new proof of a Fourier interpolation result first proved by Radchenko-Viazovska, deriving it from a vanishing result of the first cohomology of a Fuchsian group with coefficients in the Weil representation.

Submitted 27 June, 2024; originally announced June 2024.

arXiv:2406.17607 [pdf, other]

Low-Crosstalk, Silicon-Fabricated Optical Waveguides for Laser Delivery to Matter Qubits

Authors: Clayton L. Craft, Nicholas J. Barton, Andrew C. Klug, Kenneth Scalzi, Ian Wildemann, Pramod Asagodu, Joseph D. Broz, Nikola L. Porto, Michael Macalik, Anthony Rizzo, Garrett Percevault, Christopher C. Tison, A. Matthew Smith, Michael L. Fanto, James Schneeloch, Erin Sheridan, Dylan Heberle, Andrew Brownell, Vijay S. S. Sundaram, Venkatesh Deenadayalan, Matthew van Niekerk, Evan Manfreda-Schulz, Gregory A. Howland, Stefan F. Preble, Daniel Coleman, et al. (8 additional authors not shown)

Abstract: Reliable control of quantum information in matter-based qubits requires precisely applied external fields, and unaccounted-for spatial crosstalk of these fields between adjacent qubits leads to loss of fidelity. We report a CMOS foundry-produced, micro-fabricated silicon nitride (Si3N4) optical waveguide for addressing a chain of eight unequally spaced trapped barium ions with crosstalk compatible with scalable quantum information processing. The crosstalk mitigation techniques incorporated into the chip design result in a reduction of the measured optical field by at least 50.8(1.3) dB between adjacent waveguide outputs near 650 nm and similar behavior for devices designed for 493 nm and 585 nm. The waveguide outputs near 650 nm, along with a global laser near 493 nm, were used to laser-cool a chain of eight barium-138 ions, and a camera imaged the resulting fluorescence at 493 nm.

Submitted 27 June, 2024; v1 submitted 25 June, 2024; originally announced June 2024.

Comments: 9 pages, 7 figures

arXiv:2406.15648 [pdf, ps, other]

Testing the Feasibility of Linear Programs with Bandit Feedback

Authors: Aditya Gangrade, Aditya Gopalan, Venkatesh Saligrama, Clayton Scott

Abstract: While the recent literature has seen a surge in the study of constrained bandit problems, all existing methods for these begin by assuming the feasibility of the underlying problem. We initiate the study of testing such feasibility assumptions, and in particular address the problem in the linear bandit setting, thus characterising the costs of feasibility testing for an unknown linear program using bandit feedback. Concretely, we test if $\exists x: Ax \ge 0$ for an unknown $A \in \mathbb{R}^{m \times d}$, by playing a sequence of actions $x_t\in \mathbb{R}^d$, and observing $Ax_t + \mathrm{noise}$ in response. By identifying the hypothesis as determining the sign of the value of a minimax game, we construct a novel test based on low-regret algorithms and a nonasymptotic law of iterated logarithms. We prove that this test is reliable, and adapts to the 'signal level', $\Gamma$, of any instance, with mean sample costs scaling as $\widetilde{O}(d^2/\Gamma^2)$. We complement this by a minimax lower bound of $\Omega(d/\Gamma^2)$ for sample costs of reliable tests, dominating prior asymptotic lower bounds by capturing the dependence on $d$, and thus elucidating a basic insight missing in the extant literature on such problems.

Submitted 21 June, 2024; originally announced June 2024.

Comments: Spotlight presentation at ICML 2024
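
The abstract above frames feasibility as the sign of the value of a minimax game. As a rough illustration of that quantity only, the sketch below computes max over x of min over i of (Ax)_i for a known, noise-free matrix and a small finite set of candidate actions. Both simplifications are assumptions made here for illustration: the paper treats an unknown A observed through noisy bandit feedback, and its action domain is not stated in the abstract. The name `game_value` is hypothetical.

```python
def game_value(A, candidates):
    """Return (value, maximizer) of max_x min_i (A x)_i over a finite candidate set.

    Assumptions: A is known exactly and x ranges over `candidates` only.
    A positive value means some candidate satisfies every constraint with slack.
    """
    best_value, best_x = float("-inf"), None
    for x in candidates:
        # Smallest constraint value (A x)_i for this action.
        slack = min(sum(a * xi for a, xi in zip(row, x)) for row in A)
        if slack > best_value:
            best_value, best_x = slack, x
    return best_value, best_x


if __name__ == "__main__":
    A = [[1.0, 0.0], [0.0, 1.0]]
    candidates = [(1.0, 0.0), (0.0, 1.0), (0.6, 0.8), (-1.0, 0.0)]
    value, x = game_value(A, candidates)
    print("value:", value, "attained at", x)
```

The sign of the returned value is what a feasibility test would need to decide; the difficulty addressed in the paper is doing so reliably from noisy observations.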

arXiv:2406.14861 [pdf, other]

Resilience of the Electric Grid through Trustable IoT-Coordinated Assets

Authors: Vineet J. Nair, Venkatesh Venkataramanan, Priyank Srivastava, Partha S. Sarker, Anurag Srivastava, Laurentiu D. Marinovici, Jun Zha, Christopher Irwin, Prateek Mittal, John Williams, H. Vincent Poor, Anuradha M. Annaswamy

Abstract: The electricity grid has evolved from a physical system to a cyber-physical system with digital devices that perform measurement, control, communication, computation, and actuation. The increased penetration of distributed energy resources (DERs) that include renewable generation, flexible loads, and storage provides extraordinary opportunities for improvements in efficiency and sustainability. However, they can introduce new vulnerabilities in the form of cyberattacks, which can cause significant challenges in ensuring grid resilience. We propose a framework in this paper for achieving grid resilience through suitably coordinated assets including a network of Internet of Things (IoT) devices. A local electricity market is proposed to identify trustable assets and carry out this coordination. Situational Awareness (SA) of locally available DERs with the ability to inject power or reduce consumption is enabled by the market, together with a monitoring procedure for their trustability and commitment. With this SA, we show that a variety of cyberattacks can be mitigated using local trustable resources without stressing the bulk grid. The demonstrations are carried out using a variety of platforms with a high-fidelity co-simulation platform, real-time hardware-in-the-loop validation, and a utility-friendly simulator.

Submitted 21 June, 2024; originally announced June 2024.

Comments: Submitted to the Proceedings of the National Academy of Sciences (PNAS), under review

arXiv:2406.14799 [pdf, other]

Capture Point Control in Thruster-Assisted Bipedal Locomotion

Authors: Shreyansh Pitroda, Aditya Bondada, Kaushik Venkatesh Krishnamurthy, Adarsh Salagame, Chenghao Wang, Taoran Liu, Bibek Gupta, Eric Sihite, Reza Nemovi, Alireza Ramezani, Morteza Gharib

Abstract: Despite major advancements in control designs that are robust to unplanned disturbances, bipedal robots are still susceptible to falling over and struggle to negotiate rough terrains. By utilizing thrusters in our bipedal robot, we can perform additional posture manipulation and expand the modes of locomotion to enhance the robot's stability and ability to negotiate rough and difficult-to-navigate terrains. In this paper, we present our efforts in designing a controller based on capture point control for our thruster-assisted walking model named Harpy and explore its control design possibilities. While capture point control based on centroidal models for bipedal systems has been extensively studied, the incorporation of external forces that can influence the dynamics of linear inverted pendulum models, often used in capture point-based works, has not been explored before. The inclusion of these external forces can lead to interesting interpretations of locomotion, such as virtual buoyancy studied in aquatic-legged locomotion. This paper outlines the dynamical model of our robot, the capture point method we use to assist the upper body stabilization, and the simulation work done to show the controller's feasibility.

Submitted 20 June, 2024; originally announced June 2024.

Comments: Submitted and to be presented at IEEE AIM 2024. arXiv admin note: substantial text overlap with arXiv:2103.15952
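
For readers unfamiliar with the capture point mentioned in the abstract above, the sketch below evaluates the standard textbook expression for a linear inverted pendulum with constant center-of-mass height: the capture point is the position plus the velocity divided by the natural frequency sqrt(g / z0). This is the classical formula without external forces; it is not the thruster-augmented model the paper develops, and the function name and parameter values are illustrative assumptions.

```python
import math


def capture_point(com_pos, com_vel, com_height, gravity=9.81):
    """Capture point of a 1-D linear inverted pendulum (no external forces).

    com_pos, com_vel: horizontal center-of-mass position (m) and velocity (m/s).
    com_height: constant center-of-mass height z0 (m), assumed positive.
    """
    omega = math.sqrt(gravity / com_height)  # natural frequency of the pendulum
    return com_pos + com_vel / omega


if __name__ == "__main__":
    # Hypothetical state: CoM 0.1 m ahead of the origin, moving at 0.5 m/s, 0.9 m high.
    print(f"capture point: {capture_point(0.1, 0.5, 0.9):.3f} m")
```

Stepping onto this point would, in the idealized model, bring the pendulum to rest above the foot; a faster forward velocity moves the point further ahead.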

arXiv:2406.13411 [pdf, other]

Composite Concept Extraction through Backdooring

Authors: Banibrata Ghosh, Haripriya Harikumar, Khoa D Doan, Svetha Venkatesh, Santu Rana

Abstract: Learning composite concepts, such as "red car", from individual examples -- like a white car representing the concept of "car" and a red strawberry representing the concept of "red" -- is inherently challenging. This paper introduces a novel method called Composite Concept Extractor (CoCE), which leverages techniques from traditional backdoor attacks to learn these composite concepts in a zero-shot setting, requiring only examples of individual concepts. By repurposing the trigger-based model backdooring mechanism, we create a strategic distortion in the manifold of the target object (e.g., "car") induced by example objects with the target property (e.g., "red"), such as "red strawberry", ensuring the distortion selectively affects the target objects with the target property. Contrastive learning is then employed to further refine this distortion, and a method is formulated for detecting objects that are influenced by the distortion. Extensive experiments with in-depth analysis across different datasets demonstrate the utility and applicability of our proposed approach.

Submitted 21 June, 2024; v1 submitted 19 June, 2024; originally announced June 2024.

arXiv:2406.13118 [pdf, other]

Thruster-Assisted Incline Walking

Authors: Kaushik Venkatesh Krishnamurthy, Chenghao Wang, Shreyansh Pitroda, Adarsh Salagame, Eric Sihite, Reza Nemovi, Alireza Ramezani, Morteza Gharib

Abstract: In this study, our aim is to evaluate the effectiveness of thruster-assisted steep slope walking for the Husky Carbon, a quadrupedal robot equipped with custom-designed actuators and multiple electric ducted fans, through simulation prior to conducting experimental trials. Thruster-assisted steep slope walking draws inspiration from wing-assisted incline running (WAIR) observed in birds, and intriguingly incorporates posture manipulation and thrust vectoring, a locomotion technique not previously explored in the animal kingdom. Our approach involves developing a reduced-order model of the Husky robot, followed by the application of an optimization-based controller utilizing collocation methods and dynamics interpolation to determine control actions. Through simulation testing, we demonstrate the feasibility of hardware implementation of our controller.

Submitted 18 June, 2024; originally announced June 2024.

Comments: 7 pages, 7 figures, submitted to CDC 2024 conference. arXiv admin note: text overlap with arXiv:2405.06070

arXiv:2406.12818 [pdf, other]

Optimal Bailouts in Diversified Financial Networks

Authors: Krishna Dasaratha, Santosh Venkatesh, Rakesh Vohra

Abstract: Widespread default involves substantial deadweight costs which could be countered by injecting capital into failing firms. Injections have positive spillovers that can trigger a repayment cascade. But which firms should a regulator bail out so as to minimize the total injection of capital while ensuring solvency of all firms? While the problem is, in general, NP-hard, for a wide range of networks that arise from a stochastic block model, we show that the optimal bailout can be implemented by a simple policy that targets firms based on their characteristics and position in the network. Specific examples of the setting include core-periphery networks.

Submitted 18 June, 2024; originally announced June 2024.

arXiv:2406.10197 [pdf, other]

Crafting Parts for Expressive Object Composition

Authors: Harsh Rangwani, Aishwarya Agarwal, Kuldeep Kulkarni, R. Venkatesh Babu, Srikrishna Karanam

Abstract: Text-to-image generation from large generative models like Stable Diffusion, DALLE-2, etc., has become a common base for various tasks due to their superior quality and extensive knowledge bases. As image composition and generation are creative processes, artists need control over various parts of the images being generated. We find that just adding details about parts in the base text prompt either leads to an entirely different image (e.g., missing/incorrect identity) or the extra part details simply being ignored. To mitigate these issues, we introduce PartCraft, which enables image generation based on fine-grained part-level details specified for objects in the base text prompt. This allows more control for artists and enables novel object compositions by combining distinctive object parts. PartCraft first localizes object parts by denoising the object region from a specific diffusion process. This enables each part token to be localized to the right object region. After obtaining part masks, we run a localized diffusion process in each of the part regions based on fine-grained part descriptions and combine them to produce the final image. All the stages of PartCraft are based on repurposing a pre-trained diffusion model, which enables it to generalize across various domains without training. We demonstrate the effectiveness of part-level control provided by PartCraft qualitatively through visual examples and quantitatively in comparison to the contemporary baselines.

Submitted 14 June, 2024; originally announced June 2024.

Comments: Project Page Will Be Here: https://rangwani-harsh.github.io/PartCraft

arXiv:2406.09581 [pdf]

A Review of 315 Benchmark and Test Functions for Machine Learning Optimization Algorithms and Metaheuristics with Mathematical and Visual Descriptions

Authors: M. Z. Naser, Mohammad Khaled al-Bashiti, Arash Teymori Gharah Tapeh, Armin Dadras Eslamlou, Ahmed Naser, Venkatesh Kodur, Rami Hawileeh, Jamal Abdalla, Nima Khodadadi, Amir H. Gandomi

Abstract: In the rapidly evolving optimization and metaheuristics domains, the efficacy of algorithms is crucially determined by the benchmark (test) functions. While several functions have been developed and derived over the past decades, little information is available on the mathematical and visual description, range of suitability, and applications of many such functions. To bridge this knowledge gap, this review provides an exhaustive survey of more than 300 benchmark functions used in the evaluation of optimization and metaheuristics algorithms. This review first catalogs benchmark and test functions based on their characteristics, complexity, properties, visuals, and domain implications to offer a wide view that aids in selecting appropriate benchmarks for various algorithmic challenges. This review also lists the 25 most commonly used functions in the open literature and proposes two new, highly dimensional, dynamic and challenging functions that could be used for testing new algorithms. Finally, this review identifies gaps in current benchmarking practices and suggests directions for future research.

Submitted 13 June, 2024; originally announced June 2024.

arXiv:2406.05796 [pdf, other]

ProFeAT: Projected Feature Adversarial Training for Self-Supervised Learning of Robust Representations

Authors: Sravanti Addepalli, Priyam Dey, R. Venkatesh Babu

Abstract: The need for abundant labelled data in supervised Adversarial Training (AT) has prompted the use of Self-Supervised Learning (SSL) techniques with AT. However, the direct application of existing SSL methods to adversarial training has been sub-optimal due to the increased training complexity of combining SSL with AT. A recent approach, DeACL, mitigates this by utilizing supervision from a standard SSL teacher in a distillation setting, to mimic supervised AT. However, we find that there is still a large performance gap when compared to supervised adversarial training, specifically on larger models. In this work, we investigate the key reason for this gap and propose Projected Feature Adversarial Training (ProFeAT) to bridge it. We show that the sub-optimal distillation performance is a result of a mismatch in the training objectives of the teacher and student, and propose to use a projection head at the student that allows it to leverage weak supervision from the teacher while also being able to learn adversarially robust representations that are distinct from the teacher. We further propose appropriate attack and defense losses at the feature and projector, alongside a combination of weak and strong augmentations for the teacher and student respectively, to improve the training data diversity without increasing the training complexity. Through extensive experiments on several benchmark datasets and models, we demonstrate significant improvements in both clean and robust accuracy when compared to existing SSL-AT methods, setting a new state-of-the-art. We further report on-par or improved performance when compared to TRADES, a popular supervised-AT method.

Submitted 9 June, 2024; originally announced June 2024.

arXiv:2406.05494 [pdf, other]

Investigating and Addressing Hallucinations of LLMs in Tasks Involving Negation

Authors: Neeraj Varshney, Satyam Raj, Venkatesh Mishra, Agneet Chatterjee, Ritika Sarkar, Amir Saeidi, Chitta Baral

Abstract: Large Language Models (LLMs) have achieved remarkable performance across a wide variety of natural language tasks. However, they have been shown to suffer from a critical limitation pertinent to 'hallucination' in their output. Recent research has focused on investigating and addressing this problem for a variety of tasks such as biography generation, question answering, abstractive summarization, and dialogue generation. However, the crucial aspect pertaining to 'negation' has remained considerably underexplored. Negation is important because it adds depth and nuance to the understanding of language and is also crucial for logical reasoning and inference. In this work, we address the above limitation and particularly focus on studying the impact of negation in LLM hallucinations. Specifically, we study four tasks with negation: 'false premise completion', 'constrained fact generation', 'multiple choice question answering', and 'fact generation'. We show that open-source state-of-the-art LLMs such as LLaMA-2-chat, Vicuna, and Orca-2 hallucinate considerably on all these tasks involving negation, which underlines a critical shortcoming of these models. Addressing this problem, we further study numerous strategies to mitigate these hallucinations and demonstrate their impact.

Submitted 8 June, 2024; originally announced June 2024.

arXiv:2406.02244 [pdf, ps, other]

On the characterization of chordal graphs using Horn hypergeometric series

Authors: Dipnit Biswas, Irfan Habib, R. Venkatesh

Abstract: In [6], Radchenko and Villegas characterized chordal graphs by the inverses of their independence polynomials being Horn hypergeometric series. In this paper, we reprove their result using some elementary combinatorial methods and also generalize it to PEO graphs that could have a countable number of vertices. Our proof is different from the proof of [6], and it is based on the connection between the inverse of the multi-variate independence polynomials and the multi-colored chromatic polynomials of graphs, established in [1].

Submitted 4 June, 2024; originally announced June 2024.

Comments: 11 pages

MSC Class: 97K30; 05C15; 33C20; 33C70

arXiv:2406.01962 [pdf, other]

Exploring coherent dynamics in resonant x-ray scattering of intense ultrafast pulses

Authors: Akilesh Venkatesh, Phay J. Ho

Abstract: Intense x-ray free-electron lasers (XFELs) offer unique opportunities to control inner-shell electrons on ultrafast timescales. This study presents a theoretical framework for modeling resonant x-ray scattering under intense ultrafast pulses, focusing on the coherent dynamics of Rabi oscillations. We employ a time-dependent Schrödinger equation approach to investigate the effects of high-intensity pulses on the single-atom response, which includes resonant fluorescence and elastic scattering channels, and competing decay processes. Our findings highlight the sensitivity of scattering responses to pulse parameters and initial states, with interference effects playing a significant role.

Submitted 4 June, 2024; originally announced June 2024.

Comments: 17 pages, 10 figures

arXiv:2405.19303 [pdf, other]

Morse Theory for Chromatic Delaunay Triangulations

Authors: Abhinav Natarajan, Thomas Chaplin, Adam Brown, Maria-Jose Jimenez

Abstract: The chromatic alpha filtration is a generalization of the alpha filtration that can encode spatial relationships among classes of labelled point cloud data, and has applications in topological data analysis of multi-species data. In this paper we introduce the chromatic Delaunay--Čech and chromatic Delaunay--Rips filtrations, which are computationally favourable alternatives to the chromatic alpha filtration. We use generalized discrete Morse theory to show that the Čech, chromatic Delaunay--Čech, and chromatic alpha filtrations are related by simplicial collapses. Our result generalizes a result of Bauer and Edelsbrunner from the non-chromatic to the chromatic setting. We also show that the chromatic Delaunay--Rips filtration is locally stable to perturbations of the underlying point cloud. Our results provide theoretical justification for the use of chromatic Delaunay--Čech and chromatic Delaunay--Rips filtrations in applications, and we demonstrate their computational advantage with numerical experiments.

Submitted 29 May, 2024; originally announced May 2024.

Comments: 46 pages, 11 figures

MSC Class: 55N31 (Primary) 52-08 (Secondary)

arXiv:2405.18212 [pdf, ps, other]

Some Singular Examples of Relative Langlands Duality

Authors: Eric Y. Chen, Akshay Venkatesh

Abstract: Relative Langlands duality structures the study of automorphic periods around a putative duality between certain group actions of Langlands dual reductive groups. In this article, after giving a self-contained exposition of the relevant ingredients from relative Langlands duality, we examine this proposal for some interesting pairs of singular spaces: one pair arising from the cone of nilpotent (3 x 3)-matrices, and the other pair arising from the nilpotent cone of (2,2,2)-tensors. These relate, respectively, to Rankin--Selberg integrals discovered by Ginzburg and Garrett.

Submitted 28 May, 2024; originally announced May 2024.

arXiv:2405.16388 [pdf, other]

Multi-Reference Preference Optimization for Large Language Models

Authors: Hung Le, Quan Tran, Dung Nguyen, Kien Do, Saloni Mittal, Kelechi Ogueji, Svetha Venkatesh

Abstract: How can Large Language Models (LLMs) be aligned with human intentions and values? A typical solution is to gather human preferences on model outputs and finetune the LLMs accordingly while ensuring that updates do not deviate too far from a reference model. Recent approaches, such as direct preference optimization (DPO), have eliminated the need for unstable and sluggish reinforcement learning optimization by introducing closed-form supervised losses. However, a significant limitation of the current approach is its design for a single reference model only, neglecting to leverage the collective power of numerous pretrained LLMs. To overcome this limitation, we introduce a novel closed-form formulation for direct preference optimization using multiple reference models. The resulting algorithm, Multi-Reference Preference Optimization (MRPO), leverages broader prior knowledge from diverse reference models, substantially enhancing preference learning capabilities compared to the single-reference DPO. Our experiments demonstrate that LLMs finetuned with MRPO generalize better on various preference data, regardless of data scarcity or abundance. Furthermore, MRPO effectively finetunes LLMs to exhibit superior performance in several downstream natural language processing tasks such as GSM8K and TruthfulQA.

Submitted 25 May, 2024; originally announced May 2024.

Comments: 20 pages

arXiv:2405.15254 [pdf, other]

Novel Kernel Models and Exact Representor Theory for Neural Networks Beyond the Over-Parameterized Regime

Authors: Alistair Shilton, Sunil Gupta, Santu Rana, Svetha Venkatesh

Abstract: This paper presents two models of neural networks and their training applicable to neural networks of arbitrary width, depth and topology, assuming only finite-energy neural activations; and a novel representor theory for neural networks in terms of a matrix-valued kernel. The first model is exact (un-approximated) and global, casting the neural network as an element of a reproducing kernel Banach space (RKBS); we use this model to provide tight bounds on Rademacher complexity. The second model is exact and local, casting the change in neural network function resulting from a bounded change in weights and biases (i.e., a training step) in reproducing kernel Hilbert space (RKHS) in terms of a local-intrinsic neural kernel (LiNK). This local model provides insight into model adaptation through tight bounds on Rademacher complexity of network adaptation. We also prove that the neural tangent kernel (NTK) is a first-order approximation of the LiNK kernel. Finally, and noting that the LiNK does not provide a representor theory for technical reasons, we present an exact novel representor theory for layer-wise neural network training with unregularized gradient descent in terms of a local-extrinsic neural kernel (LeNK). This representor theory gives insight into the role of higher-order statistics in neural network training and the effect of kernel evolution in neural-network kernel models. Throughout the paper (a) feedforward ReLU networks and (b) residual networks (ResNet) are used as illustrative examples.

Submitted 24 May, 2024; originally announced May 2024.

arXiv:2405.14405 [pdf, other]

Qubit-efficient Variational Quantum Algorithms for Image Segmentation

Authors: Supreeth Mysore Venkatesh, Antonio Macaluso, Marlon Nuske, Matthias Klusch, Andreas Dengel

Abstract: Quantum computing is expected to transform a range of computational tasks beyond the reach of classical algorithms. In this work, we examine the application of variational quantum algorithms (VQAs) for unsupervised image segmentation to partition images into separate semantic regions. Specifically, we formulate the task as a graph cut optimization problem and employ two established qubit-efficient VQAs, which we refer to as Parametric Gate Encoding (PGE) and Ancilla Basis Encoding (ABE), to find the optimal segmentation mask. In addition, we propose Adaptive Cost Encoding (ACE), a new approach that leverages the same circuit architecture as ABE but adopts a problem-dependent cost function. We benchmark PGE, ABE and ACE on synthetically generated images, focusing on quality and trainability. ACE shows consistently faster convergence in training the parameterized quantum circuits in comparison to PGE and ABE. Furthermore, we provide a theoretical analysis of the scalability of these approaches against the Quantum Approximate Optimization Algorithm (QAOA), showing a significant cutback in the quantum resources, especially in the number of qubits, which depends logarithmically on the number of pixels. The results validate the strengths of ACE, while concurrently highlighting its inherent limitations and challenges. This paves the way for further research in quantum-enhanced computer vision.

Submitted 23 May, 2024; originally announced May 2024.

Comments: 7 pages, 4 figures, 2 tables

arXiv:2405.10556 [pdf, other]

Parameterized Complexity of Dominating Set Variants in Almost Cluster and Split Graphs

Authors: Dishant Goyal, Ashwin Jacob, Kaushtubh Kumar, Diptapriyo Majumdar, Venkatesh Raman

Abstract: We consider structural parameterizations of the fundamental Dominating Set problem and its variants in the parameter ecology program. We give improved FPT algorithms and lower bounds under well-known conjectures for dominating set in graphs that are k vertices away from a cluster graph or a split graph. These are graphs in which there is a set of k vertices (called the modulator) whose deletion results in a cluster graph or a split graph. We also call k the deletion distance (to the appropriate class of graphs). When parameterized by the deletion distance k to cluster graphs, we can find a minimum dominating set (DS) in 3^k n^{O(1)}-time. Within the same time, we can also find a minimum independent dominating set (IDS) or a minimum dominating clique (DC) or a minimum efficient dominating set (EDS) or a minimum total dominating set (TDS). We also show that most of these variants of dominating set do not have a polynomial-sized kernel. Additionally, we show that when parameterized by the deletion distance k to split graphs, IDS can be solved in 2^k n^{O(1)}-time and EDS can be solved in 3^{k/2} n^{O(1)}-time.

Submitted 17 May, 2024; originally announced May 2024.

Comments: Some of the results appeared in proceedings of CSR 2018

arXiv:2405.06070 [pdf, other]

Narrow-Path, Dynamic Walking Using Integrated Posture Manipulation and Thrust Vectoring

Authors: Kaushik Venkatesh Krishnamurthy, Chenghao Wang, Shreyansh Pitroda, Adarsh Salagame, Eric Sihite, Reza Nemovi, Alireza Ramezani, Morteza Gharib

Abstract: This research concentrates on enhancing the navigational capabilities of Northeastern University's Husky, a multi-modal quadrupedal robot that can integrate posture manipulation and thrust vectoring to traverse narrow pathways, such as walking over pipes and slacklining. The Husky is outfitted with thrusters designed to stabilize its body during dynamic walking over these narrow paths. The project involves modeling the robot using the HROM (Husky Reduced Order Model) and developing an optimal control framework. This framework is based on polynomial approximation of the HROM and a collocation approach to derive optimal thruster commands necessary for achieving dynamic walking on narrow paths. The effectiveness of the modeling and control design approach is validated through simulations conducted using Matlab.

Submitted 9 May, 2024; originally announced May 2024.

Comments: arXiv admin note: text overlap with arXiv:2312.12586

arXiv:2405.04389 [pdf, ps, other]

Triangulated characterizations of singularities

Authors: Pat Lank, Sridhar Venkatesh

Abstract: This work presents a range of triangulated characterizations for important classes of singularities such as derived splinters, rational singularities, and Du Bois singularities. An invariant called 'level' in a triangulated category can be used to measure the failure of a variety to have a prescribed singularity type. We provide explicit computations of this invariant for reduced Nagata schemes of Krull dimension one and for affine cones over smooth projective hypersurfaces. Furthermore, these computations are utilized to produce upper bounds for Rouquier dimension on the respective bounded derived categories.

Submitted 10 May, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

Comments: Current: Removed properness assumptions, removed Section 4, improved exposition. Previous: Initial version

MSC Class: 14F08 (primary); 14B05 (secondary); 14F17; 14A30; 14E15; 18G80

arXiv:2405.03727 [pdf, other]

Large Language Models Synergize with Automated Machine Learning

Authors: Jinglue Xu, Jialong Li, Zhen Liu, Nagar Anthel Venkatesh Suryanarayanan, Guoyuan Zhou, Jia Guo, Hitoshi Iba, Kenji Tei

Abstract: Recently, program synthesis driven by large language models (LLMs) has become increasingly popular. However, program synthesis for machine learning (ML) tasks still poses significant challenges. This paper explores a novel form of program synthesis, targeting ML programs, by combining LLMs and automated machine learning (autoML). Specifically, our goal is to fully automate the generation and optimization of the code of the entire ML workflow, from data preparation to modeling and post-processing, utilizing only textual descriptions of the ML tasks. To manage the length and diversity of ML programs, we propose to break each ML program into smaller, manageable parts. Each part is generated separately by the LLM, with careful consideration of their compatibilities. To ensure compatibilities, we design a testing technique for ML programs. Unlike traditional program synthesis, which typically relies on binary evaluations (i.e., correct or incorrect), evaluating ML programs necessitates more than just binary judgments. Therefore, we further assess ML programs numerically and select the optimal programs from a range of candidates using autoML methods. In experiments across various ML tasks, our method outperforms existing methods in 10 out of 12 tasks for generating ML programs. In addition, autoML significantly improves the performance of the generated ML programs. In experiments, given the textual task description, our method, Text-to-ML, generates the complete and optimized ML program in a fully autonomous process.

Submitted 11 May, 2024; v1 submitted 6 May, 2024; originally announced May 2024.

arXiv:2405.03602 [pdf, other]

One nose but two nostrils: Learn to align with sparse connections between two olfactory cortices

Authors: Bo Liu, Shanshan Qin, Venkatesh Murthy, Yuhai Tu

Abstract: The integration of neural representations in the two hemispheres is an important problem in neuroscience. Recent experiments revealed that odor responses in cortical neurons driven by separate stimulation of the two nostrils are highly correlated. This bilateral alignment points to structured inter-hemispheric connections, but the detailed mechanism remains unclear. Here, we hypothesized that continuous exposure to environmental odors shapes these projections and modeled it as online learning with a local Hebbian rule. We found that Hebbian learning with sparse connections achieves bilateral alignment, exhibiting a linear trade-off between speed and accuracy. We identified an inverse scaling relationship between the number of cortical neurons and the inter-hemispheric projection density required for desired alignment accuracy, i.e., more cortical neurons allow sparser inter-hemispheric projections. We next compared the alignment performance of the local Hebbian rule and the global stochastic-gradient-descent (SGD) learning for artificial neural networks. We found that although SGD leads to the same alignment accuracy with modestly sparser connectivity, the same inverse scaling relation holds. We showed that their similar performance originates from the fact that the update vectors of the two learning rules align significantly throughout the learning process. This insight may inspire efficient sparse local learning algorithms for more complex problems.

Submitted 6 May, 2024; originally announced May 2024.

arXiv:2405.01310 [pdf, other]

Overcoming LLM Challenges using RAG-Driven Precision in Coffee Leaf Disease Remediation

Authors: Dr. Selva Kumar S, Afifah Khan Mohammed Ajmal Khan, Imadh Ajaz Banday, Manikantha Gada, Vibha Venkatesh Shanbhag

Abstract: This research introduces an innovative AI-driven precision agriculture system, leveraging YOLOv8 for disease identification and Retrieval Augmented Generation (RAG) for context-aware diagnosis. Focused on addressing the challenges of diseases affecting the coffee production sector in Karnataka, the system integrates sophisticated object detection techniques with language models to address the inherent constraints associated with Large Language Models (LLMs). Our methodology not only tackles the issue of hallucinations in LLMs, but also introduces dynamic disease identification and remediation strategies. Real-time monitoring, collaborative dataset expansion, and organizational involvement ensure the system's adaptability in diverse agricultural settings. The effect of the suggested system extends beyond automation, aiming to secure food supplies, protect livelihoods, and promote eco-friendly farming practices. By facilitating precise disease identification, the system contributes to sustainable and environmentally conscious agriculture, reducing reliance on pesticides. Looking to the future, the project envisions continuous development in RAG-integrated object detection systems, emphasizing scalability, reliability, and usability. This research strives to be a beacon for positive change in agriculture, aligning with global efforts toward sustainable and technologically enhanced food production.

Submitted 2 May, 2024; originally announced May 2024.

Comments: 6 pages, 3 figures

arXiv:2405.01156 [pdf, other]

Self-Supervised Learning for Interventional Image Analytics: Towards Robust Device Trackers

Authors: Saahil Islam, Venkatesh N. Murthy, Dominik Neumann, Badhan Kumar Das, Puneet Sharma, Andreas Maier, Dorin Comaniciu, Florin C. Ghesu

Abstract: Accurate detection and tracking of devices such as guiding catheters in live X-ray image acquisitions is an essential prerequisite for endovascular cardiac interventions. This information is leveraged for procedural guidance, e.g., directing stent placements. To ensure procedural safety and efficacy, there is a need for high robustness, with no failures during tracking. To achieve that, one needs to efficiently tackle challenges such as device obscuration by contrast agent or other external devices or wires, changes in field-of-view or acquisition angle, and continuous movement due to cardiac and respiratory motion. To overcome these challenges, we propose a novel approach to learn spatio-temporal features from a very large data cohort of over 16 million interventional X-ray frames using self-supervision for image sequence data. Our approach is based on a masked image modeling technique that leverages frame-interpolation-based reconstruction to learn fine inter-frame temporal correspondences. The features encoded in the resulting model are fine-tuned downstream. Our approach achieves state-of-the-art performance and in particular robustness compared to ultra-optimized reference solutions (that use multi-stage feature fusion, multi-task and flow regularization). The experiments show that our method achieves 66.31% reduction in maximum tracking error against reference solutions (23.20% when flow regularization is used), achieving a success score of 97.95% at a 3x faster inference speed of 42 frames-per-second (on GPU). The results encourage the use of our approach in various other tasks within interventional image analytics that require effective understanding of spatio-temporal semantics.

Submitted 2 May, 2024; originally announced May 2024.

arXiv:2404.19668 [pdf, other]

SQUAT: Stateful Quantization-Aware Training in Recurrent Spiking Neural Networks

Authors: Sreyes Venkatesh, Razvan Marinescu, Jason K. Eshraghian

Abstract: Weight quantization is used to deploy high-performance deep learning models on resource-limited hardware, enabling the use of low-precision integers for storage and computation. Spiking neural networks (SNNs) share the goal of enhancing efficiency, but adopt an 'event-driven' approach to reduce the power consumption of neural network inference. While extensive research has focused on weight quantization, quantization-aware training (QAT), and their application to SNNs, the precision reduction of state variables during training has been largely overlooked, potentially diminishing inference performance. This paper introduces two QAT schemes for stateful neurons: (i) a uniform quantization strategy, an established method for weight quantization, and (ii) threshold-centered quantization, which allocates exponentially more quantization levels near the firing threshold. Our results show that increasing the density of quantization levels around the firing threshold improves accuracy across several benchmark datasets. We provide an ablation analysis of the effects of weight and state quantization, both individually and combined, and how they impact models. Our comprehensive empirical evaluation includes full precision, 8-bit, 4-bit, and 2-bit quantized SNNs, using QAT, stateful QAT (SQUAT), and post-training quantization methods. The findings indicate that the combination of QAT and SQUAT enhances performance the most, but given the choice of one or the other, QAT improves performance by the larger degree. These trends are consistent across all datasets. Our methods have been made available in our Python library snnTorch: https://github.com/jeshraghian/snntorch.

Submitted 14 April, 2024; originally announced April 2024.

Comments: 10 pages, 4 figures, accepted at NICE 2024

arXiv:2404.18963 [pdf, other]

RE-GrievanceAssist: Enhancing Customer Experience through ML-Powered Complaint Management

Authors: Venkatesh C, Harshit Oberoi, Anurag Kumar Pandey, Anil Goyal, Nikhil Sikka

Abstract: In recent years, digital platform companies have faced increasing challenges in managing customer complaints, driven by widespread consumer adoption. This paper introduces an end-to-end pipeline, named RE-GrievanceAssist, designed specifically for real estate customer complaint management. The pipeline consists of three key components: i) response/no-response ML model using TF-IDF vectorization and XGBoost classifier; ii) user type classifier using fastText classifier; iii) issue/sub-issue classifier using TF-IDF vectorization and XGBoost classifier. Finally, it has been deployed as a batch job in Databricks, resulting in a remarkable 40% reduction in overall manual effort with a monthly cost reduction of Rs 1,50,000 since August 2023.

Submitted 29 April, 2024; originally announced April 2024.

arXiv:2404.16553 [pdf, other]

doi 10.1145/3632410.3632487

RE-RecSys: An End-to-End system for recommending properties in Real-Estate domain

Authors: Venkatesh C, Harshit Oberoi, Anil Goyal, Nikhil Sikka

Abstract: We propose an end-to-end real-estate recommendation system, RE-RecSys, which has been productionized in a real-world industry setting. We categorize any user into 4 categories based on available historical data: i) cold-start users; ii) short-term users; iii) long-term users; and iv) short-long term users. For cold-start users, we propose a novel rule-based engine that is based on the popularity of locality and user preferences. For short-term users, we propose to use a content-filtering model which recommends properties based on recent interactions of users. For long-term and short-long term users, we propose a novel combination of content and collaborative filtering based approach which can be easily productionized in a real-world scenario. Moreover, based on the conversion rate, we have designed a novel weighting scheme for different impressions made by users on the platform for the training of content and collaborative models. Finally, we show the efficiency of the proposed pipeline, RE-RecSys, on a real-world property and clickstream dataset collected from a leading real-estate platform in India. We show that the proposed pipeline is deployable in a real-world scenario with an average latency of <40 ms serving 1000 rpm.

Submitted 25 April, 2024; originally announced April 2024.

arXiv:2404.14847 [pdf]

High-order harmonic generation from laser induced plasma comprising CdSe/V2O5 Core/Shell quantum dots embedded on MoS2 nanosheets

Authors: Srinivasa Rao Konda, Puspendu Barik, Subshash Singh, Venkatesh Mottamchetty, Amit Srivasthava, Vyacheslav V. Kim, Rashid A. Ganeev, Chunlei Guo, Wei Li

Abstract: Research on the nonlinear optical characteristics of transition metal dichalcogenides in the presence of photoactive particles, plasmonic nanocavities, waveguides, and metamaterials is still in its early stages. This investigation delves into the high-order harmonic generation (HHG) from laser-induced plasma of MoS2 nanosheets in the presence of a semiconductor photoactive medium such as CdSe and CdSe/V2O5 core/shell quantum dots. Our comprehensive findings shed light on the counteractive coupling impact of both bare and passivated quantum dots on MoS2 nanosheets, as evidenced by the emission of higher-order harmonics. Significantly, the intensity of harmonics and their cut-off were notably enhanced in the MoS2-CdSe and MoS2-V-CdSe configurations compared to pristine MoS2 nanosheets. These advancements hold promise for applications requiring the emission of coherent short-wavelength radiation.

Submitted 23 April, 2024; originally announced April 2024.

Comments: 8 pages, 4 figures

arXiv:2404.13308 [pdf, ps, other]

ABACUS: An Impairment Aware Joint Optimal Dynamic RMLSA in Elastic Optical Networks

Authors: M Jyothi Kiran, Venkatesh Chebolu, Goutam Das, Raja Datta

Abstract: The challenge of optimal Routing and Spectrum Assignment (RSA) is significant in Elastic Optical Networks. Integrating adaptive modulation formats into the RSA problem - Routing, Modulation Level, and Spectrum Assignment - broadens allocation options and increases complexity. The conventional RSA approach entails predetermining fixed paths and then allocating spectrum within them separately. However, expanding the path set for optimality may not be advisable due to the substantial increase in paths with network size expansion. This paper delves into a novel approach called RMLSA, which proposes a comprehensive solution addressing both route determination and spectrum assignment simultaneously. An objective function named ABACUS, Adaptive Balance of Average Clustering and Utilization of Spectrum, is chosen for its capability to adjust and assign significance to average clustering and spectrum utilization. Our approach involves formulating an Integer Linear Programming model with a straightforward relationship between path and spectrum constraints. The model also integrates Physical Layer Impairments to ensure end-to-end Quality of Transmission for requested connections while maintaining existing ones. We demonstrate that ILP can offer an optimal solution for a dynamic traffic scenario within a reasonable time complexity. To achieve this goal, we adopt a structured formulation approach where essential information is determined beforehand, thus minimizing the need for online computations.

Submitted 20 April, 2024; originally announced April 2024.

arXiv:2404.12680 [pdf, other]

VoxAtnNet: A 3D Point Clouds Convolutional Neural Network for Generalizable Face Presentation Attack Detection

Authors: Raghavendra Ramachandra, Narayan Vetrekar, Sushma Venkatesh, Savita Nageshker, Jag Mohan Singh, R. S. Gad

Abstract: Facial biometrics are an essential component of smartphones to ensure reliable and trustworthy authentication. However, face biometric systems are vulnerable to Presentation Attacks (PAs), and the availability of more sophisticated presentation attack instruments such as 3D silicone face masks will allow attackers to deceive face recognition systems easily. In this work, we propose a novel Presentation Attack Detection (PAD) algorithm based on 3D point clouds captured using the frontal camera of a smartphone to detect presentation attacks. The proposed PAD algorithm, VoxAtnNet, processes 3D point clouds to obtain voxelization to preserve the spatial structure. Then, the voxelized 3D samples were trained using the novel convolutional attention network to detect PAs on the smartphone. Extensive experiments were carried out on the newly constructed 3D face point cloud dataset comprising bona fide and two different 3D PAIs (3D silicone face mask and wrap photo mask), resulting in 3480 samples. The performance of the proposed method was compared with existing methods to benchmark the detection performance using three different evaluation protocols. The experimental results demonstrate the improved performance of the proposed method in detecting both known and unknown face presentation attacks.

Submitted 19 April, 2024; originally announced April 2024.

Comments: Accepted in 2024 18th International Conference on Automatic Face and Gesture Recognition (FG)

arXiv:2404.12561 [pdf]

Charge transfer mechanism on MoS$_2$ nanosheets in the presence of a semiconductor photoactive media

Authors: Srinivasa Rao Konda, Puspendu Barik, Subshash Singh, Venkatesh Mottamchetty, Amit Srivasthava, Rashid A. Ganeev, Soma Venugopal Rao, Chunlei Guo, Wei Li

Abstract: Studies of the nonlinear optical (NLO) properties of the transition metal dichalcogenides (TMDs) coupled with photoactive particles, plasmonic nanocavities, waveguides, and metamaterials remain in their infancy. This study investigates the third-order NLO properties of MoS$_2$ nanosheets in the presence of a semiconductor photoactive medium. Our extensive studies and the obtained results reveal the counteractive coupling effect of bare and passivated quantum dots on the MoS$_2$ nanosheet, as made evident by the analysis of the NLO processes. The enhanced NLO properties of MoS$_2$ nanosheets functionalized with CdSe and CdSe-V2O5 quantum dots are helpful for applications as saturable absorbers in laser applications and the emission of coherent short-wavelength radiation. The multiphoton-excitation resonance energy transfer mechanism, exploiting remote dipole-dipole coupling and ultrafast charge transfer pathways, emerges as another plausible way to alter the NLO properties in TMDs.

Submitted 18 April, 2024; originally announced April 2024.

Comments: 16 pages, 4 figures

arXiv:2404.11870 [pdf, ps, other]

Enhancing Length Extrapolation in Sequential Models with Pointer-Augmented Neural Memory

Authors: Hung Le, Dung Nguyen, Kien Do, Svetha Venkatesh, Truyen Tran

Abstract: We propose Pointer-Augmented Neural Memory (PANM) to help neural networks understand and apply symbol processing to new, longer sequences of data. PANM integrates an external neural memory that uses novel physical addresses and pointer manipulation techniques to mimic human and computer symbol processing abilities. PANM facilitates pointer assignment, dereference, and arithmetic by explicitly using physical pointers to access memory content. Remarkably, it can learn to perform these operations through end-to-end training on sequence data, powering various sequential models. Our experiments demonstrate PANM's exceptional length extrapolating capabilities and improved performance in tasks that require symbol processing, such as algorithmic reasoning and Dyck language recognition. PANM helps Transformer achieve up to 100% generalization accuracy in compositional learning tasks and significantly better results in mathematical reasoning, question answering and machine translation tasks.

Submitted 17 April, 2024; originally announced April 2024.

Comments: Preprint

arXiv:2404.07148 [pdf, other]

How Consistent are Clinicians? Evaluating the Predictability of Sepsis Disease Progression with Dynamics Models

Authors: Unnseo Park, Venkatesh Sivaraman, Adam Perer

Abstract: Reinforcement learning (RL) is a promising approach to generate treatment policies for sepsis patients in intensive care. While retrospective evaluation metrics show decreased mortality when these policies are followed, studies with clinicians suggest their recommendations are often spurious. We propose that these shortcomings may be due to lack of diversity in observed actions and outcomes in the training data, and we construct experiments to investigate the feasibility of predicting sepsis disease severity changes due to clinician actions. Preliminary results suggest incorporating action information does not significantly improve model performance, indicating that clinician actions may not be sufficiently variable to yield measurable effects on disease progression. We discuss the implications of these findings for optimizing sepsis treatment.

Submitted 10 April, 2024; originally announced April 2024.

Comments: 6 pages, 3 figures; accepted workshop paper at Time Series for Health @ ICLR 2024

arXiv:2404.04767 [pdf, ps, other]

The intersection cohomology Hodge module of toric varieties

Authors: Hyunsuk Kim, Sridhar Venkatesh

Abstract: We study the Hodge filtration of the intersection cohomology Hodge module for toric varieties. More precisely, we study the cohomology sheaves of the graded de Rham complex of the intersection cohomology Hodge module and give a precise formula relating it with the stalks of the intersection cohomology as a constructible complex. The main idea is to use the Ishida complex in order to compute the higher direct images of the sheaf of reflexive differentials.

Submitted 22 May, 2024; v1 submitted 6 April, 2024; originally announced April 2024.

Comments: 24 pages, minor changes

MSC Class: 14B05; 14C30; 14F10; 14M25; 14Q99; 32S35; 52B22

arXiv:2404.03155 [pdf, other]

TEGRA -- Scaling Up Terascale Graph Processing with Disaggregated Computing

Authors: William Shaddix, Mahyar Samani, Marjan Fariborz, S. J. Ben Yoo, Jason Lowe-Power, Venkatesh Akella

Abstract: Graphs are essential for representing relationships in various domains, driving modern AI applications such as graph analytics and neural networks across science, engineering, cybersecurity, transportation, and economics. However, the size of modern graphs is rapidly expanding, posing challenges for traditional CPUs and GPUs in meeting real-time processing demands. As a result, hardware accelerators for graph processing have been proposed. However, the largest graphs that can be handled by these systems are still modest, often targeting the Twitter graph (approximately 1.4B edges). This paper aims to address this limitation by developing a graph accelerator capable of terascale graph processing. Scale-out architectures, in which nodes are replicated to expand to larger datasets, are natural for handling larger graphs. We argue that this approach is not appropriate for very large-scale graphs because it leads to underutilization of both memory resources and compute resources. Additionally, vertex and edge processing have different access patterns. Communication overheads also pose further challenges in designing scalable architectures. To overcome these issues, this paper proposes TEGRA, a scale-up architecture for terascale graph processing. TEGRA leverages a composable computing system with disaggregated resources and a communication architecture inspired by Active Messages. By employing direct communication between cores and optimizing memory interconnect utilization, TEGRA effectively reduces communication overhead and improves resource utilization, therefore enabling efficient processing of terascale graphs.

Submitted 3 April, 2024; originally announced April 2024.

Comments: Presented at the 3rd Workshop on Heterogeneous Composable and Disaggregated Systems (HCDS 2024)

arXiv:2404.02900 [pdf, other]

DeiT-LT Distillation Strikes Back for Vision Transformer Training on Long-Tailed Datasets

Authors: Harsh Rangwani, Pradipto Mondal, Mayank Mishra, Ashish Ramayee Asokan, R. Venkatesh Babu

Abstract: Vision Transformer (ViT) has emerged as a prominent architecture for various computer vision tasks. In ViT, we divide the input image into patch tokens and process them through a stack of self-attention blocks. However, unlike Convolutional Neural Networks (CNN), ViT's simple architecture has no informative inductive bias (e.g., locality). Due to this, ViT requires a large amount of data for pre-training. Various data-efficient approaches (DeiT) have been proposed to train ViT on balanced datasets effectively. However, limited literature discusses the use of ViT for datasets with long-tailed imbalances. In this work, we introduce DeiT-LT to tackle the problem of training ViTs from scratch on long-tailed datasets. In DeiT-LT, we introduce an efficient and effective way of distillation from CNN via distillation DIST token by using out-of-distribution images and re-weighting the distillation loss to enhance focus on tail classes. This leads to the learning of local CNN-like features in early ViT blocks, improving generalization for tail classes. Further, to mitigate overfitting, we propose distilling from a flat CNN teacher, which leads to learning low-rank generalizable features for DIST tokens across all ViT blocks. With the proposed DeiT-LT scheme, the distillation DIST token becomes an expert on the tail classes, and the classifier CLS token becomes an expert on the head classes. The experts help to effectively learn features corresponding to both the majority and minority classes using a distinct set of tokens within the same ViT architecture. We show the effectiveness of DeiT-LT for training ViT from scratch on datasets ranging from small-scale CIFAR-10 LT to large-scale iNaturalist-2018.

Submitted 3 April, 2024; originally announced April 2024.

Comments: CVPR 2024. Project Page: https://rangwani-harsh.github.io/DeiT-LT

arXiv:2404.02324 [pdf, other]

Learning from Demonstration Framework for Multi-Robot Systems Using Interaction Keypoints and Soft Actor-Critic Methods

Authors: Vishnunandan L. N. Venkatesh, Byung-Cheol Min

Abstract: Learning from Demonstration (LfD) is a promising approach to enable Multi-Robot Systems (MRS) to acquire complex skills and behaviors. However, the intricate interactions and coordination challenges in MRS pose significant hurdles for effective LfD. In this paper, we present a novel LfD framework specifically designed for MRS, which leverages visual demonstrations to capture and learn from robot-robot and robot-object interactions. Our framework introduces the concept of Interaction Keypoints (IKs) to transform the visual demonstrations into a representation that facilitates the inference of various skills necessary for the task. The robots then execute the task using sensorimotor actions and reinforcement learning (RL) policies when required. A key feature of our approach is the ability to handle unseen contact-based skills that emerge during the demonstration. In such cases, RL is employed to learn the skill using a classifier-based reward function, eliminating the need for manual reward engineering and ensuring adaptability to environmental changes. We evaluate our framework across a range of mobile robot tasks, covering both behavior-based and contact-based domains. The results demonstrate the effectiveness of our approach in enabling robots to learn complex multi-robot tasks and behaviors from visual demonstrations.

Submitted 2 April, 2024; originally announced April 2024.

arXiv:2404.02318 [pdf, other]

ZeroCAP: Zero-Shot Multi-Robot Context Aware Pattern Formation via Large Language Models

Authors: Vishnunandan L. N. Venkatesh, Byung-Cheol Min

Abstract: Incorporating language comprehension into robotic operations unlocks significant advancements in robotics, but also presents distinct challenges, particularly in executing spatially oriented tasks like pattern formation. This paper introduces ZeroCAP, a novel system that integrates large language models with multi-robot systems for zero-shot context-aware pattern formation. Grounded in the principles of language-conditioned robotics, ZeroCAP leverages the interpretative power of language models to translate natural language instructions into actionable robotic configurations. This approach combines vision-language models, cutting-edge segmentation techniques and shape descriptors, enabling the realization of complex, context-driven pattern formations in the realm of multi-robot coordination. Through extensive experiments, we demonstrate the system's proficiency in executing complex context-aware pattern formations across a spectrum of tasks, from surrounding and caging objects to infilling regions. This not only validates the system's capability to interpret and implement intricate context-driven tasks but also underscores its adaptability and effectiveness across varied environments and scenarios. More details about this work are available at: https://sites.google.com/view/zerocap/home

Submitted 2 April, 2024; originally announced April 2024.

arXiv:2404.01035 [pdf, other]

MICROSIM: A high performance phase-field solver based on CPU and GPU implementations

Authors: Tanmay Dutta, Dasari Mohan, Saurav Shenoy, Nasir Attar, Abhikshek Kalokhe, Ajay Sagar, Swapnil Bhure, Swaroop . S. Pradhan, Jitendriya Praharaj, Subham Mridha, Anshika Kushwaha, Vaishali Shah, M. P. Gururajan, V. Venkatesh Shenoi, Gandham Phanikumar, Saswata Bhattacharyya, Abhik Choudhury

Abstract: The phase-field method has become a useful tool for the simulation of classical metallurgical phase transformations as well as other phenomena related to materials science. The thermodynamic consistency that forms the basis of these formulations lends itself to strong predictive capabilities and utility. However, a strong impediment to the usage of the method for typical applied problems of industrial and academic relevance is the significant overhead with regard to the code development and know-how required for quantitative model formulations. In this paper, we report the development of an open-source phase-field software stack that contains generic formulations for the simulation of multi-phase and multi-component phase transformations. The solvers incorporate thermodynamic coupling that allows the realization of simulations with real alloys in scenarios directly relevant to the materials industry. Further, the solvers utilize parallelization strategies using either multiple CPUs or GPUs to provide cross-platform portability and usability on available supercomputing machines. Finally, the solver stack also contains a graphical user interface to gradually introduce the usage of the software. The user interface also provides a collection of post-processing tools that allow the estimation of useful metrics related to microstructural evolution.

Submitted 1 April, 2024; originally announced April 2024.

arXiv:2403.19822 [pdf, other]

Multi-Stage Multi-Modal Pre-Training for Automatic Speech Recognition

Authors: Yash Jain, David Chan, Pranav Dheram, Aparna Khare, Olabanji Shonibare, Venkatesh Ravichandran, Shalini Ghosh

Abstract: Recent advances in machine learning have demonstrated that multi-modal pre-training can improve automatic speech recognition (ASR) performance compared to randomly initialized models, even when models are fine-tuned on uni-modal tasks. Existing multi-modal pre-training methods for the ASR task have primarily focused on single-stage pre-training where a single unsupervised task is used for pre-training followed by fine-tuning on the downstream task. In this work, we introduce a novel method combining multi-modal and multi-task unsupervised pre-training with a translation-based supervised mid-training approach. We empirically demonstrate that such a multi-stage approach leads to relative word error rate (WER) improvements of up to 38.45% over baselines on both Librispeech and SUPERB. Additionally, we share several important findings for choosing pre-training methods and datasets.

Submitted 28 March, 2024; originally announced March 2024.

Comments: Accepted in LREC-COLING 2024 - The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation

arXiv:2403.18301 [pdf, other]

Selective Mixup Fine-Tuning for Optimizing Non-Decomposable Objectives

Authors: Shrinivas Ramasubramanian, Harsh Rangwani, Sho Takemori, Kunal Samanta, Yuhei Umeda, Venkatesh Babu Radhakrishnan

Abstract: The rise in internet usage has led to the generation of massive amounts of data, resulting in the adoption of various supervised and semi-supervised machine learning algorithms, which can effectively utilize the colossal amount of data to train models. However, before deploying these models in the real world, they must be strictly evaluated on performance measures like worst-case recall and satisfy constraints such as fairness. We find that current state-of-the-art empirical techniques offer sub-optimal performance on these practical, non-decomposable performance objectives. On the other hand, the theoretical techniques necessitate training a new model from scratch for each performance objective. To bridge the gap, we propose SelMix, a selective mixup-based inexpensive fine-tuning technique for pre-trained models, to optimize for the desired objective. The core idea of our framework is to determine a sampling distribution to perform a mixup of features between samples from particular classes such that it optimizes the given objective. We comprehensively evaluate our technique against the existing empirical and theoretically principled methods on standard benchmark datasets for imbalanced classification. We find that the proposed SelMix fine-tuning significantly improves the performance for various practical non-decomposable objectives across benchmarks.

Submitted 27 March, 2024; originally announced March 2024.

Comments: ICLR 2024 SpotLight

arXiv:2403.17730 [pdf, ps, other]

On Structural Non-commutativity in Affine Feedback of SISO Nonlinear Systems

Authors: Venkatesh G. S.

Abstract: The affine feedback connection of SISO nonlinear systems modeled by Chen--Fliess series is shown to be a group action on the plant which is isomorphic to the semi-direct product of shuffle and additive group of non-commutative formal power series. The additive and multiplicative feedback loops in an affine feedback connection are thus proven to be structurally non-commutative. A flip in the order of these loops results in a net additive feedback loop.

Submitted 26 March, 2024; originally announced March 2024.

Comments: submitted to $26^{th}$ International Symposium on Mathematical Theory of Networks and Systems, 2024

arXiv:2403.10495 [pdf, other]

PnP Restoration with Domain Adaptation for SANS

Authors: Shirin Shoushtari, Edward P. Chandler, Jialiang Zhang, Manjula Senanayake, Sai Venkatesh Pingali, Marcus Foston, Ulugbek S. Kamilov

Abstract: Small Angle Neutron Scattering (SANS) is a non-destructive technique utilized to probe the nano- to mesoscale structure of materials by analyzing the scattering pattern of neutrons. Accelerating SANS acquisition for in-situ analysis is essential, but it often reduces the signal-to-noise ratio (SNR), highlighting the need for methods to enhance SNR even with short acquisition times. While deep learning (DL) can be used for enhancing the SNR of low-quality SANS data, the amount of experimental data available for training is usually severely limited. We address this issue by proposing a Plug-and-play Restoration for SANS (PR-SANS) that uses domain-adapted priors. The prior in PR-SANS is initially trained on a set of generic images and subsequently fine-tuned using a limited amount of experimental SANS data. We present a theoretical convergence analysis of PR-SANS by focusing on the error resulting from using inexact domain-adapted priors instead of the ideal ones. We demonstrate with experimentally collected SANS data that PR-SANS can recover high-SNR 2D SANS detector images from low-SNR detector images, effectively increasing the SNR. This advancement enables a reduction in acquisition times by a factor of 12 while maintaining the original signal quality.

Submitted 15 March, 2024; originally announced March 2024.

arXiv:2403.04311 [pdf, other]

ALTO: An Efficient Network Orchestrator for Compound AI Systems

Authors: Keshav Santhanam, Deepti Raghavan, Muhammad Shahir Rahman, Thejas Venkatesh, Neha Kunjal, Pratiksha Thaker, Philip Levis, Matei Zaharia

Abstract: We present ALTO, a network orchestrator for efficiently serving compound AI systems such as pipelines of language models. ALTO achieves high throughput and low latency by taking advantage of an optimization opportunity specific to generative language models: streaming intermediate outputs. As language models produce outputs token by token, ALTO exposes opportunities to stream intermediate outputs between stages when possible. We highlight two new challenges of correctness and load balancing which emerge when streaming intermediate data across distributed pipeline stage instances. We also motivate the need for an aggregation-aware routing interface and distributed prompt-aware scheduling to address these challenges. We demonstrate the impact of ALTO's partial output streaming on a complex chatbot verification pipeline, increasing throughput by up to 3x for a fixed latency target of 4 seconds / request while also reducing tail latency by 1.8x compared to a baseline serving approach.

Submitted 7 March, 2024; originally announced March 2024.

arXiv:2403.04187 [pdf, other]

Preference optimization of protein language models as a multi-objective binder design paradigm

Authors: Pouria Mistani, Venkatesh Mysore

Abstract: We present a multi-objective binder design paradigm based on instruction fine-tuning and direct preference optimization (DPO) of autoregressive protein language models (pLMs). Multiple design objectives are encoded in the language model through direct optimization on expert-curated preference sequence datasets comprising preferred and dispreferred distributions. We show that the proposed alignment strategy enables ProtGPT2 to effectively design binders conditioned on specified receptors and a drug developability criterion. Generated binder samples demonstrate median isoelectric point (pI) improvements of $17\%-60\%$.

Submitted 6 March, 2024; originally announced March 2024.

Comments: Published at the GEM workshop, ICLR 2024. Generative and Experimental Perspectives for Biomolecular Design (https://www.gembio.ai/)

arXiv:2403.02863 [pdf, other]

Domain wall and Magnetic Tunnel Junction Hybrid for on-chip Learning in UNet architecture

Authors: Venkatesh Vadde, Bhaskaran Muralidharan, Abhishek Sharma

Abstract: We present a spintronic-device-based hardware implementation of UNet for segmentation tasks. Our approach involves designing hardware for the convolution, deconvolution, rectified activation function (ReLU), and max pooling layers of the UNet architecture. We designed the convolution and deconvolution layers of the network using the synaptic behavior of the domain wall MTJ. We also construct the ReLU and max pooling functions of the network utilizing the spin Hall driven orthogonal current injected MTJ. To incorporate the diverse physics of spin-transport, magnetization dynamics, and CMOS elements in our UNet design, we employ a hybrid simulation setup that couples micromagnetic simulation, non-equilibrium Green's function, and SPICE simulation along with network implementation. We evaluate our UNet design on the CamVid dataset and achieve a segmentation accuracy of 83.71$\%$ on test data, on par with the software implementation, with 821 mJ of energy consumption for on-chip training over 150 epochs. We further demonstrate nearly one order $(10\times)$ improvement in the energy requirement of the network using unstable ferromagnet ($Δ$=4.58) over the stable ferromagnet ($Δ$=45) based ReLU and max pooling functions while maintaining similar accuracy. The hybrid architecture comprising domain wall MTJ and unstable FM-based MTJ leads to an on-chip energy consumption of 85.79 mJ during training, with a testing energy cost of 1.55 $μJ$.

Submitted 11 July, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

arXiv:2403.01281 [pdf, other]

Fast Low-parameter Video Activity Localization in Collaborative Learning Environments

Authors: Venkatesh Jatla, Sravani Teeparthi, Ugesh Egala, Sylvia Celedon Pattichis, Marios S. Patticis

Abstract: Research on video activity detection has primarily focused on identifying well-defined human activities in short video segments. The majority of the research on video activity recognition is focused on the development of large parameter systems that require training on large video datasets. This paper develops a low-parameter, modular system with rapid inferencing capabilities that can be trained entirely on limited datasets without requiring transfer learning from large-parameter systems. The system can accurately detect and associate specific activities with the students who perform the activities in real-life classroom videos. Additionally, the paper develops an interactive web-based application to visualize human activity maps over long real-life classroom videos.

Submitted 9 March, 2024; v1 submitted 2 March, 2024; originally announced March 2024.

Showing 1–50 of 1,014 results for author: Venkatesh