Search | arXiv e-print repository

Relative Group Trisections

Authors: Nickolas Andres Castro, Jason Joseph, Patrick K. McFaddin

Abstract: Trisections of closed 4-manifolds, first defined and studied by Gay and Kirby, have proved to be a useful tool in the systematic analysis of 4-manifolds via handlebodies. Subsequent work of Abrams, Gay, and Kirby established a connection with the algebraic notion of a group trisection, which strikingly defines a one-to-one correspondence. We generalize the notion of a group trisection to the non-closed case by defining and studying relative group trisections. We establish an analogous one-to-one correspondence between relative trisections and relative group trisections up to equivalence. The key lemma in the construction may be of independent interest, as it generalizes the classical fact that there is a unique handlebody extension of a surface realizing a given surjection. Moreover, we establish a functorial relationship between relative trisections of manifolds and groups, extending work of Klug in the closed case.

Submitted 20 June, 2024; originally announced June 2024.

Comments: 16 pages, 1 figure

arXiv:2406.04673 [pdf, other]

MeLFusion: Synthesizing Music from Image and Language Cues using Diffusion Models

Authors: Sanjoy Chowdhury, Sayan Nag, K J Joseph, Balaji Vasan Srinivasan, Dinesh Manocha

Abstract: Music is a universal language that can communicate emotions and feelings. It forms an essential part of the whole spectrum of creative media, ranging from movies to social media posts. Machine learning models that can synthesize music are predominantly conditioned on textual descriptions of it. Inspired by how musicians compose music not just from a movie script, but also through visualizations, we propose MeLFusion, a model that can effectively use cues from a textual description and the corresponding image to synthesize music. MeLFusion is a text-to-music diffusion model with a novel "visual synapse", which effectively infuses the semantics from the visual modality into the generated music. To facilitate research in this area, we introduce a new dataset MeLBench, and propose a new evaluation metric IMSM. Our exhaustive experimental evaluation suggests that adding visual information to the music synthesis pipeline significantly improves the quality of generated music, measured both objectively and subjectively, with a relative gain of up to 67.98% on the FAD score. We hope that our work will gather attention to this pragmatic, yet relatively under-explored research area.

Submitted 7 June, 2024; originally announced June 2024.

Comments: Accepted at CVPR 2024 as Highlight paper. Webpage: https://schowdhury671.github.io/melfusion_cvpr2024/

arXiv:2405.11511 [pdf, other]

Online Action Representation using Change Detection and Symbolic Programming

Authors: Vishnu S Nair, Sneha Sree, Jayaraj Joseph, Mohanasankar Sivaprakasam

Abstract: This paper addresses the critical need for online action representation, which is essential for various applications like rehabilitation, surveillance, etc. The task can be defined as representation of actions as soon as they happen in a streaming video without access to video frames in the future. Most of the existing methods use predefined window sizes for video segments, which is a restrictive assumption on the dynamics. The proposed method employs a change detection algorithm to automatically segment action sequences, which form meaningful sub-actions and subsequently fit symbolic generative motion programs to the clipped segments. We determine the start time and end time of segments using change detection followed by a piece-wise linear fit algorithm on joint angle and bone length sequences. Domain-specific symbolic primitives are fit to pose keypoint trajectories of those extracted segments in order to obtain a higher level semantic representation. Since this representation is part-based, it is complementary to the compositional nature of human actions, i.e., a complex activity can be broken down into elementary sub-actions. We show the effectiveness of this representation in the downstream task of class agnostic repetition detection. We propose a repetition counting algorithm based on consecutive similarity matching of primitives, which can do online repetition counting. We also compare the results with a similar but offline repetition counting algorithm. The results of the experiments demonstrate that, despite operating online, the proposed method performs better than or on par with the existing method.

Submitted 19 May, 2024; originally announced May 2024.

arXiv:2405.04326 [pdf, other]

A Calibratable Model for Fast Energy Estimation of MVM Operations on RRAM Crossbars

Authors: José Cubero-Cascante, Arunkumar Vaidyanathan, Rebecca Pelke, Lorenzo Pfeifer, Rainer Leupers, Jan Moritz Joseph

Abstract: The surge in AI usage demands innovative power reduction strategies. Novel Compute-in-Memory (CIM) architectures, leveraging advanced memory technologies, hold the potential for significantly lowering energy consumption by integrating storage with parallel Matrix-Vector-Multiplications (MVMs). This study addresses the 1T1R RRAM crossbar, a core component in numerous CIM architectures. We introduce an abstract model and a calibration methodology for estimating operational energy. Our tool condenses circuit-level behaviour into a few parameters, facilitating energy assessments for DNN workloads. Validation against low-level SPICE simulations demonstrates speedups of up to 1000x and energy estimations with errors below 1%.

Submitted 13 May, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

Comments: Pre-print of work presented at AICAS 2024. 5 pages, 6 figures

ACM Class: C.3; I.2; I.6

arXiv:2403.13655 [pdf, other]

A Fully Automated Platform for Evaluating ReRAM Crossbars

Authors: Rebecca Pelke, Felix Staudigl, Niklas Thomas, Nils Bosbach, Mohammed Hossein, Jose Cubero-Cascante, Leticia Bolzani Poehls, Rainer Leupers, Jan Moritz Joseph

Abstract: Resistive Random Access Memory (ReRAM) is a promising candidate for implementing Computing-in-Memory (CIM) architectures and neuromorphic circuits. ReRAM cells exhibit significant variability across different memristive devices and cycles, necessitating further improvements in the areas of devices, algorithms, and applications. To achieve this, understanding the stochastic behavior of the different ReRAM technologies is essential. The NeuroBreakoutBoard (NBB) is a versatile instrumentation platform to characterize Non-Volatile Memories (NVMs). However, the NBB itself does not provide any functionality in the form of software or a controller. In this paper, we present a control board for the NBB able to perform reliability assessments of 1T1R ReRAM crossbars. In more detail, an interface that allows a host PC to communicate with the NBB via the new control board is implemented. In a case study, we analyze the Cycle-to-Cycle (C2C) variation and read disturb of TiN/Ti/HfO2/TiN cells for different read voltages to gain an understanding of their operational behavior.

Submitted 20 March, 2024; originally announced March 2024.

arXiv:2402.06185 [pdf, other]

Development and validation of an artificial intelligence model to accurately predict spinopelvic parameters

Authors: Edward S. Harake, Joseph R. Linzey, Cheng Jiang, Rushikesh S. Joshi, Mark M. Zaki, Jaes C. Jones, Siri S. Khalsa, John H. Lee, Zachary Wilseck, Jacob R. Joseph, Todd C. Hollon, Paul Park

Abstract: Objective. Achieving appropriate spinopelvic alignment has been shown to be associated with improved clinical symptoms. However, measurement of spinopelvic radiographic parameters is time-intensive and interobserver reliability is a concern. Automated measurement tools have the promise of rapid and consistent measurements, but existing tools are still limited by some degree of manual user-entry requirements. This study presents a novel artificial intelligence (AI) tool called SpinePose that automatically predicts spinopelvic parameters with high accuracy without the need for manual entry. Methods. SpinePose was trained and validated on 761 sagittal whole-spine X-rays to predict sagittal vertical axis (SVA), pelvic tilt (PT), pelvic incidence (PI), sacral slope (SS), lumbar lordosis (LL), T1-pelvic angle (T1PA), and L1-pelvic angle (L1PA). A separate test set of 40 X-rays was labeled by 4 reviewers, including fellowship-trained spine surgeons and a fellowship-trained radiologist with neuroradiology subspecialty certification. Median errors relative to the most senior reviewer were calculated to determine model accuracy on test images. Intraclass correlation coefficients (ICC) were used to assess inter-rater reliability. Results. SpinePose exhibited the following median (interquartile range) parameter errors: SVA: 2.2 (2.3) mm, p=0.93; PT: 1.3 (1.2)°, p=0.48; SS: 1.7 (2.2)°, p=0.64; PI: 2.2 (2.1)°, p=0.24; LL: 2.6 (4.0)°, p=0.89; T1PA: 1.1 (0.9)°, p=0.42; and L1PA: 1.4 (1.6)°, p=0.49. Model predictions also exhibited excellent reliability at all parameters (ICC: 0.91-1.0). Conclusions. SpinePose accurately predicted spinopelvic parameters with excellent reliability comparable to fellowship-trained spine surgeons and neuroradiologists. Utilization of predictive AI tools in spinal imaging can substantially aid in patient selection and surgical planning.

Submitted 8 February, 2024; originally announced February 2024.

Comments: 10 pages, 5 figures, to appear in Journal of Neurosurgery: Spine

arXiv:2401.07671 [pdf, other]

CLSA-CIM: A Cross-Layer Scheduling Approach for Computing-in-Memory Architectures

Authors: Rebecca Pelke, Jose Cubero-Cascante, Nils Bosbach, Felix Staudigl, Rainer Leupers, Jan Moritz Joseph

Abstract: The demand for efficient machine learning (ML) accelerators is growing rapidly, driving the development of novel computing concepts such as resistive random access memory (RRAM)-based tiled computing-in-memory (CIM) architectures. CIM allows computation within the memory unit, resulting in faster data processing and reduced power consumption. Efficient compiler algorithms are essential to exploit the potential of tiled CIM architectures. While conventional ML compilers focus on code generation for CPUs, GPUs, and other von Neumann architectures, adaptations are needed to cover CIM architectures. Cross-layer scheduling is a promising approach, as it enhances the utilization of CIM cores, thereby accelerating computations. Although similar concepts are implicitly used in previous work, there is a lack of clear and quantifiable algorithmic definitions for cross-layer scheduling for tiled CIM architectures. To close this gap, we present CLSA-CIM, a cross-layer scheduling algorithm for tiled CIM architectures. We integrate CLSA-CIM with existing weight-mapping strategies and compare performance against state-of-the-art (SOTA) scheduling algorithms. CLSA-CIM improves the utilization by up to 17.9x, resulting in an overall speedup increase of up to 29.2x compared to SOTA.

Submitted 17 January, 2024; v1 submitted 15 January, 2024; originally announced January 2024.

arXiv:2309.03805 [pdf, other]

Mapping of CNNs on multi-core RRAM-based CIM architectures

Authors: Rebecca Pelke, Nils Bosbach, Jose Cubero, Felix Staudigl, Rainer Leupers, Jan Moritz Joseph

Abstract: RRAM-based multi-core systems improve the energy efficiency and performance of CNNs. However, the distributed parallel execution of convolutional layers causes critical data dependencies that limit the potential speedup. This paper presents synchronization techniques for parallel inference of convolutional layers on RRAM-based CIM architectures. We propose an architecture optimization that enables efficient data exchange and discuss the impact of different architecture setups on the performance. The corresponding compiler algorithms are optimized for high speedup and low memory consumption during CNN inference. We achieve more than 99% of the theoretical acceleration limit with a marginal data transmission overhead of less than 4% for state-of-the-art CNN benchmarks.

Submitted 26 October, 2023; v1 submitted 7 September, 2023; originally announced September 2023.

arXiv:2309.00613 [pdf, other]

Iterative Multi-granular Image Editing using Diffusion Models

Authors: K J Joseph, Prateksha Udhayanan, Tripti Shukla, Aishwarya Agarwal, Srikrishna Karanam, Koustava Goswami, Balaji Vasan Srinivasan

Abstract: Recent advances in text-guided image synthesis have dramatically changed how creative professionals generate artistic and aesthetically pleasing visual assets. To fully support such creative endeavors, the process should possess the ability to: 1) iteratively edit the generations and 2) control the spatial reach of desired changes (global, local or anything in between). We formalize this pragmatic problem setting as Iterative Multi-granular Editing. While there has been substantial progress with diffusion-based models for image synthesis and editing, they are all one shot (i.e., no iterative editing capabilities) and do not naturally yield multi-granular control (i.e., covering the full spectrum of local-to-global edits). To overcome these drawbacks, we propose EMILIE: Iterative Multi-granular Image Editor. EMILIE introduces a novel latent iteration strategy, which re-purposes a pre-trained diffusion model to facilitate iterative editing. This is complemented by a gradient control operation for multi-granular control. We introduce a new benchmark dataset to evaluate our newly proposed setting. We conduct exhaustive quantitative and qualitative evaluation against recent state-of-the-art approaches adapted to our task, to bring out the mettle of EMILIE. We hope our work would attract attention to this newly identified, pragmatic problem setting.

Submitted 28 October, 2023; v1 submitted 1 September, 2023; originally announced September 2023.

Comments: Accepted to IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2024

arXiv:2308.09445 [pdf, other]

doi 10.1007/978-3-031-46077-7_12

parti-gem5: gem5's Timing Mode Parallelised

Authors: José Cubero-Cascante, Niko Zurstraßen, Jörn Nöller, Rainer Leupers, Jan Moritz Joseph

Abstract: Detailed timing models are indispensable tools for the design space exploration of Multiprocessor Systems on Chip (MPSoCs). As core counts continue to increase, the complexity in memory hierarchies and interconnect topologies is also growing, making accurate predictions of design decisions more challenging than ever. In this context, the open-source Full System Simulator (FSS) gem5 is a popular choice for MPSoC design space exploration, thanks to its flexibility and robust set of detailed timing models. However, its single-threaded simulation kernel severely hampers its throughput. To address this challenge, we introduce parti-gem5, an extension of gem5 that enables parallel timing simulations on modern multi-core simulation hosts. Unlike previous works, parti-gem5 supports gem5's timing mode, the O3CPU, and Ruby's custom cache and interconnect models. Compared to reference single-thread simulations, we achieved speedups of up to 42.7x when simulating a 120-core ARM MPSoC on a 64-core x86-64 host system. While our method introduces timing deviations, the error in total simulated time is below 15% in most cases.

Submitted 13 May, 2024; v1 submitted 18 August, 2023; originally announced August 2023.

Comments: Pre-print of work presented at SAMOS Conference XXIII

ACM Class: I.6.0

arXiv:2308.03881 [pdf, other]

doi 10.1051/0004-6361/202245601

Measuring the Numerical Viscosity in Simulations of Protoplanetary Disks in Cartesian Grids -- The Viscously Spreading Ring Revisited

Authors: Jibin Joseph, Alexandros Ziampras, Lucas Jordan, George A. Turpin, Richard P. Nelson

Abstract: Hydrodynamical simulations solve the governing equations on a discrete grid of space and time. This discretization causes numerical diffusion similar to a physical viscous diffusion, whose magnitude is often unknown or poorly constrained. With the current trend of simulating accretion disks with no or very low prescribed physical viscosity, it becomes essential to understand and quantify this inherent numerical diffusion, in the form of a numerical viscosity. We study the behavior of the viscous spreading ring and the spiral instability that develops in it. We then use this setup to quantify the numerical viscosity in Cartesian grids and study its properties. We simulate the viscous spreading ring and the related instability on a two-dimensional polar grid using PLUTO as well as FARGO, and ensure convergence of our results with a resolution study. We then repeat our models on a Cartesian grid and measure the numerical viscosity by comparing results to the known analytical solution, using PLUTO and Athena++. We find that the numerical viscosity in a Cartesian grid scales with resolution as approximately $\nu_{\mathrm{num}} \propto \Delta x^2$ and is equivalent to an effective $\alpha \sim 10^{-4}$ for a common numerical setup. We also show that the spiral instability manifests as a single leading spiral throughout the whole domain on polar grids. This is contrary to previous results and indicates that sufficient resolution is necessary in order to correctly resolve the instability. Our results are relevant in the context of models where the origin should be included in the computational domain, or when polar grids cannot be used. Examples of such cases include models of disk accretion onto a central binary and inherently Cartesian codes.
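The quoted scaling of numerical viscosity with grid spacing can be checked from any two runs at different resolutions. The following is a minimal sketch, not the authors' procedure: the function name `scaling_exponent` and the sample values are hypothetical, and it assumes a pure power law, nu_num = C * dx**p, between the two measurements.

```python
import math

def scaling_exponent(dx_coarse, nu_coarse, dx_fine, nu_fine):
    """Estimate p in nu_num = C * dx**p from two (dx, nu_num) measurements.

    Assumes a pure power law between the two resolutions (an assumption,
    not something the abstract guarantees for every setup).
    """
    return math.log(nu_coarse / nu_fine) / math.log(dx_coarse / dx_fine)

# Hypothetical measurements: halving dx reduces nu_num by a factor of 4,
# which is what a second-order scaling (p close to 2) would produce.
p = scaling_exponent(dx_coarse=0.02, nu_coarse=4e-6, dx_fine=0.01, nu_fine=1e-6)
print(round(p, 6))
```

An exponent near 2 from measured data would be consistent with the approximate scaling stated in the abstract; a clearly different value would suggest the two runs are not in the same convergence regime.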

Submitted 7 August, 2023; originally announced August 2023.

Journal ref: A&A 678, A134 (2023)

arXiv:2308.02400 [pdf, other]

Work-in-Progress: A Universal Instrumentation Platform for Non-Volatile Memories

Authors: Felix Staudigl, Mohammed Hossein, Tobias Ziegler, Hazem Al Indari, Rebecca Pelke, Sebastian Siegel, Dirk J. Wouters, Dominik Sisejkovic, Jan Moritz Joseph, Rainer Leupers

Abstract: Emerging non-volatile memories (NVMs) represent a disruptive technology that allows a paradigm shift from the conventional von Neumann architecture towards more efficient computing-in-memory (CIM) architectures. Several instrumentation platforms have been proposed to interface NVMs allowing the characterization of single cells and crossbar structures. However, these platforms suffer from low flexibility and are not capable of performing CIM operations on NVMs. Therefore, we recently designed and built the NeuroBreakoutBoard, a highly versatile instrumentation platform capable of executing CIM on NVMs. We present our preliminary results demonstrating a relative error < 5% in the range of 1 k$\Omega$ to 1 M$\Omega$ and showcase the switching behavior of a HfO$_2$/Ti-based memristive cell.

Submitted 3 August, 2023; originally announced August 2023.

arXiv:2307.00910 [pdf, other]

CoPL: Contextual Prompt Learning for Vision-Language Understanding

Authors: Koustava Goswami, Srikrishna Karanam, Prateksha Udhayanan, K J Joseph, Balaji Vasan Srinivasan

Abstract: Recent advances in multimodal learning have resulted in powerful vision-language models, whose representations are generalizable across a variety of downstream tasks. Recently, their generalization ability has been further extended by incorporating trainable prompts, borrowed from the natural language processing literature. While such prompt learning techniques have shown impressive results, we identify that these prompts are trained based on global image features, which limits them in two aspects: First, by using global features, these prompts could be focusing less on the discriminative foreground image, resulting in poor generalization to various out-of-distribution test cases. Second, existing work weights all prompts equally whereas intuitively, prompts should be reweighted according to the semantics of the image. We address these as part of our proposed Contextual Prompt Learning (CoPL) framework, capable of aligning the prompts to the localized features of the image. Our key innovations over earlier works include using local image features as part of the prompt learning process, and more crucially, learning to weight these prompts based on local features that are appropriate for the task at hand. This gives us dynamic prompts that are both aligned to local image features as well as aware of local contextual relationships. Our extensive set of experiments on a variety of standard and few-shot datasets show that our method produces substantially improved performance when compared to the current state of the art methods. We also demonstrate both few-shot and out-of-distribution performance to establish the utility of learning dynamic prompts that are aligned to local image features.

Submitted 12 December, 2023; v1 submitted 3 July, 2023; originally announced July 2023.

Comments: Accepted at AAAI 2024

arXiv:2306.14544 [pdf, other]

A-STAR: Test-time Attention Segregation and Retention for Text-to-image Synthesis

Authors: Aishwarya Agarwal, Srikrishna Karanam, K J Joseph, Apoorv Saxena, Koustava Goswami, Balaji Vasan Srinivasan

Abstract: While recent developments in text-to-image generative models have led to a suite of high-performing methods capable of producing creative imagery from free-form text, there are several limitations. By analyzing the cross-attention representations of these models, we notice two key issues. First, for text prompts that contain multiple concepts, there is a significant amount of pixel-space overlap (i.e., same spatial regions) among pairs of different concepts. This eventually leads to the model being unable to distinguish between the two concepts and one of them being ignored in the final generation. Next, while these models attempt to capture all such concepts during the beginning of denoising (e.g., first few steps) as evidenced by cross-attention maps, this knowledge is not retained by the end of denoising (e.g., last few steps). Such loss of knowledge eventually leads to inaccurate generation outputs. To address these issues, our key innovations include two test-time attention-based loss functions that substantially improve the performance of pretrained baseline text-to-image diffusion models. First, our attention segregation loss reduces the cross-attention overlap between attention maps of different concepts in the text prompt, thereby reducing the confusion/conflict among various concepts and leading to the eventual capture of all concepts in the generated output. Next, our attention retention loss explicitly forces text-to-image diffusion models to retain cross-attention information for all concepts across all denoising time steps, thereby leading to reduced information loss and the preservation of all concepts in the generated output.

Submitted 26 June, 2023; originally announced June 2023.

Comments: 15 pages, 16 figures

arXiv:2305.19956 [pdf, other]

doi 10.1016/j.compmedimag.2024.102326

MicroSegNet: A Deep Learning Approach for Prostate Segmentation on Micro-Ultrasound Images

Authors: Hongxu Jiang, Muhammad Imran, Preethika Muralidharan, Anjali Patel, Jake Pensa, Muxuan Liang, Tarik Benidir, Joseph R. Grajo, Jason P. Joseph, Russell Terry, John Michael DiBianco, Li-Ming Su, Yuyin Zhou, Wayne G. Brisbane, Wei Shao

Abstract: Micro-ultrasound (micro-US) is a novel 29-MHz ultrasound technique that provides 3-4 times higher resolution than traditional ultrasound, potentially enabling low-cost, accurate diagnosis of prostate cancer. Accurate prostate segmentation is crucial for prostate volume measurement, cancer diagnosis, prostate biopsy, and treatment planning. However, prostate segmentation on micro-US is challenging due to artifacts and indistinct borders between the prostate, bladder, and urethra in the midline. This paper presents MicroSegNet, a multi-scale annotation-guided transformer UNet model designed specifically to tackle these challenges. During the training process, MicroSegNet focuses more on regions that are hard to segment (hard regions), characterized by discrepancies between expert and non-expert annotations. We achieve this by proposing an annotation-guided binary cross entropy (AG-BCE) loss that assigns a larger weight to prediction errors in hard regions and a lower weight to prediction errors in easy regions. The AG-BCE loss was seamlessly integrated into the training process through the utilization of multi-scale deep supervision, enabling MicroSegNet to capture global contextual dependencies and local information at various scales. We trained our model using micro-US images from 55 patients, followed by evaluation on 20 patients. Our MicroSegNet model achieved a Dice coefficient of 0.939 and a Hausdorff distance of 2.02 mm, outperforming several state-of-the-art segmentation methods, as well as three human annotators with different experience levels. Our code is publicly available at https://github.com/mirthAI/MicroSegNet and our dataset is publicly available at https://zenodo.org/records/10475293.
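The idea behind an annotation-guided weighted cross entropy can be illustrated in a few lines. This is only a sketch of the general technique described in the abstract, not the authors' implementation: the function name `ag_bce`, the weights `w_hard` and `w_easy`, the flat-list pixel layout, and the normalization by pixel count are all assumptions made for illustration.

```python
import math

def ag_bce(pred, expert, non_expert, w_hard=4.0, w_easy=1.0, eps=1e-7):
    """Weighted binary cross entropy over flat lists of pixels.

    pred       : predicted foreground probabilities in [0, 1]
    expert     : expert labels (0 or 1), used as the training target
    non_expert : non-expert labels (0 or 1)
    A pixel counts as "hard" where the two annotations disagree and gets
    weight w_hard; all other pixels get w_easy. Both weights are
    hypothetical values chosen for this example.
    """
    total = 0.0
    for p, y, y_ne in zip(pred, expert, non_expert):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        bce = -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        weight = w_hard if y != y_ne else w_easy
        total += weight * bce
    return total / len(pred)

# Two pixels predicted at 0.5: the annotators agree on the first and
# disagree on the second, so the second contributes four times as much.
loss = ag_bce(pred=[0.5, 0.5], expert=[1, 0], non_expert=[1, 1])
print(loss)
```

With equal weights the function reduces to ordinary mean binary cross entropy, which makes the effect of the hard-region weighting easy to isolate in an experiment.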

Submitted 25 January, 2024; v1 submitted 31 May, 2023; originally announced May 2023.

Journal ref: Computerized Medical Imaging and Graphics (2024): 102326

arXiv:2305.11961 [pdf, other]

The 4D Camera: an 87 kHz direct electron detector for scanning/transmission electron microscopy

Authors: Peter Ercius, Ian J. Johnson, Philipp Pelz, Benjamin H. Savitzky, Lauren Hughes, Hamish G. Brown, Steven E. Zeltmann, Shang-Lin Hsu, Cassio C. S. Pedroso, Bruce E. Cohen, Ramamoorthy Ramesh, David Paul, John M. Joseph, Thorsten Stezelberger, Cory Czarnik, Matthew Lent, Erin Fong, Jim Ciston, Mary C. Scott, Colin Ophus, Andrew M. Minor, and Peter Denes

Abstract: We describe the development, operation, and application of the 4D Camera -- a 576 by 576 pixel active pixel sensor for scanning/transmission electron microscopy which operates at 87,000 Hz. The detector generates data at approximately 480 Gbit/s which is captured by dedicated receiver computers with a parallelized software infrastructure that has been implemented to process the resulting 10 - 700 Gigabyte-sized raw datasets. The back illuminated detector provides the ability to detect single electron events at accelerating voltages from 30 - 300 keV. Through electron counting, the resulting sparse data sets are reduced in size by 10 - 300x compared to the raw data, and open-source sparsity-based processing algorithms offer rapid data analysis. The high frame rate allows for large and complex 4D-STEM experiments to be accomplished with typical STEM scanning parameters.

Submitted 19 May, 2023; originally announced May 2023.

arXiv:2303.14772 [pdf, other]

$Δ$-Patching: A Framework for Rapid Adaptation of Pre-trained Convolutional Networks without Base Performance Loss

Authors: Chaitanya Devaguptapu, Samarth Sinha, K J Joseph, Vineeth N Balasubramanian, Animesh Garg

Abstract: Models pre-trained on large-scale datasets are often fine-tuned to support newer tasks and datasets that arrive over time. This process necessitates storing copies of the model over time for each task that the pre-trained model is fine-tuned to. Building on top of recent model patching work, we propose $Δ$-Patching for fine-tuning neural network models in an efficient manner, without the need to store model copies. We propose a simple and lightweight method called $Δ$-Networks to achieve this objective. Our comprehensive experiments across setting and architecture variants show that $Δ$-Networks outperform earlier model patching work while only requiring a fraction of parameters to be trained. We also show that this approach can be used for other problem settings such as transfer learning and zero-shot domain adaptation, as well as other tasks such as detection and segmentation.

Submitted 21 September, 2023; v1 submitted 26 March, 2023; originally announced March 2023.

arXiv:2302.07655 [pdf, other]

Fault Injection in Native Logic-in-Memory Computation on Neuromorphic Hardware

Authors: Felix Staudigl, Thorben Fetz, Rebecca Pelke, Dominik Sisejkovic, Jan Moritz Joseph, Leticia Bolzani Pöhls, Rainer Leupers

Abstract: Logic-in-memory (LIM) describes the execution of logic gates within memristive crossbar structures, promising to improve performance and energy efficiency. Utilizing only binary values, LIM particularly excels in accelerating binary neural networks (BNNs), shifting it into the focus of edge applications. Considering its potential, the impact of faults on BNNs accelerated with LIM still lacks investigation. In this paper, we propose faulty logic-in-memory (FLIM), a fault injection platform capable of executing full-fledged BNNs on LIM while injecting in-field faults. The results show that FLIM runs a single MNIST picture 66754x faster than the state of the art by offering a fine-grained fault injection methodology.

Submitted 15 February, 2023; originally announced February 2023.

arXiv:2212.00290 [pdf, other]

doi 10.1016/j.compind.2023.103885

Component Segmentation of Engineering Drawings Using Graph Convolutional Networks

Authors: Wentai Zhang, Joe Joseph, Yue Yin, Liuyue Xie, Tomotake Furuhata, Soji Yamakawa, Kenji Shimada, Levent Burak Kara

Abstract: We present a data-driven framework to automate the vectorization and machine interpretation of 2D engineering part drawings. In industrial settings, most manufacturing engineers still rely on manual reads to identify the topological and manufacturing requirements from drawings submitted by designers. The interpretation process is laborious and time-consuming, which severely inhibits the efficiency of part quotation and manufacturing tasks. While recent advances in image-based computer vision methods have demonstrated great potential in interpreting natural images through semantic segmentation approaches, the application of such methods in parsing engineering technical drawings into semantically accurate components remains a significant challenge. The severe pixel sparsity in engineering drawings also restricts the effective featurization of image-based data-driven methods. To overcome these challenges, we propose a deep learning based framework that predicts the semantic type of each vectorized component. Taking a raster image as input, we vectorize all components through thinning, stroke tracing, and cubic bezier fitting. Then a graph of such components is generated based on the connectivity between the components. Finally, a graph convolutional neural network is trained on this graph data to identify the semantic type of each component. We test our framework in the context of semantic segmentation of text, dimension, and contour components in engineering drawings.
Results show that our method yields the best performance compared to recent image- and graph-based segmentation methods.

Submitted 14 March, 2023; v1 submitted 1 December, 2022; originally announced December 2022.

Comments: Preprint accepted to Computers in Industry

arXiv:2210.09669 [pdf, other]

Bridge trisections and Seifert solids

Authors: Jason Joseph, Jeffrey Meier, Maggie Miller, Alexander Zupan

Abstract: We adapt Seifert's algorithm for classical knots and links to the setting of tri-plane diagrams for bridge trisected surfaces in the 4-sphere. Our approach allows for the construction of a Seifert solid that is described by a Heegaard diagram. The Seifert solids produced can be assumed to have exteriors that can be built without 3-handles; in contrast, we give examples of Seifert solids (not coming from our construction) whose exteriors require arbitrarily many 3-handles. We conclude with two classification results. The first shows that surfaces admitting doubly-standard shadow diagrams are unknotted. The second says that a $b$-bridge trisection in which some sector contains at least $b-1$ patches is completely decomposable, thus the corresponding surface is unknotted. This settles affirmatively a conjecture of the second and fourth authors.

Submitted 18 October, 2022; originally announced October 2022.

Comments: 23 pages, 6 figures; v1 of arXiv:2112.11557 has been divided into two papers: v2, to be posted simultaneously, and the present article, which adds an expanded discussion of Seifert solids

arXiv:2208.07760 [pdf, other]

Transmission Structured Illumination Microscopy using Tilt-mirror Assembly

Authors: Krishnendu Samanta, Azeem Ahmad, Jean-Claude Tinguely, Balpreet Singh Ahluwalia, Joby Joseph

Abstract: We present an experimental demonstration of tilt-mirror assisted transmission structured illumination microscopy (tSIM) that offers large field-of-view super-resolution imaging. An assembly of custom-designed tilt-mirrors is employed as the illumination module, where the sample is excited with the interference of two beams reflected from the opposite pair of mirror facets. Tunable frequency structured patterns are generated by changing the mirror-tilt angle, and the hexagonal-symmetric arrangement is considered for isotropic resolution in three orientations. Utilizing a high numerical aperture (NA) objective in standard SIM provides super-resolution at the expense of the field of view (FOV). Employing low NA (20X/0.4) objective lens detection, we experimentally demonstrate a ~ (0.56mm x 0.35mm) size single FOV image with ~1.7- and ~2.4-fold resolution improvement (exploiting various illumination by tuning tilt-mirrors) over the diffraction limit. The results are verified both for fluorescent beads and for biological samples. The tSIM geometry decouples the illumination and the collection light paths, consequently enabling free change of the imaging objective lens without influencing the spatial frequency of the illumination pattern, which is defined by the tilt-mirrors. The large and scalable FOV supported by tSIM will find usage in applications where scanning large areas is necessary, as in pathology, and in applications where images must be correlated both in space and time.

Submitted 16 August, 2022; originally announced August 2022.

Comments: Main (13 pages, 5 figures), Supplementary (6 figures)

arXiv:2208.03767 [pdf, other]

Class-Incremental Learning with Cross-Space Clustering and Controlled Transfer

Authors: Arjun Ashok, K J Joseph, Vineeth Balasubramanian

Abstract: In class-incremental learning, the model is expected to learn new classes continually while maintaining knowledge on previous classes. The challenge here lies in preserving the model's ability to effectively represent prior classes in the feature space, while adapting it to represent incoming new classes. We propose two distillation-based objectives for class incremental learning that leverage the structure of the feature space to maintain accuracy on previous classes, as well as enable learning the new classes. In our first objective, termed cross-space clustering (CSC), we propose to use the feature space structure of the previous model to characterize directions of optimization that maximally preserve the class: directions that all instances of a specific class should collectively optimize towards, and those that they should collectively optimize away from. Apart from minimizing forgetting, this indirectly encourages the model to cluster all instances of a class in the current feature space, and gives rise to a sense of herd-immunity, allowing all samples of a class to jointly combat the model from forgetting the class. Our second objective termed controlled transfer (CT) tackles incremental learning from an understudied perspective of inter-class transfer. CT explicitly approximates and conditions the current model on the semantic similarities between incrementally arriving classes and prior classes.
This allows the model to learn classes in such a way that it maximizes positive forward transfer from similar prior classes, thus increasing plasticity, and minimizes negative backward transfer on dissimilar prior classes, thereby strengthening stability. We perform extensive experiments on two benchmark datasets, adding our method (CSCCT) on top of three prominent class-incremental learning methods. We observe consistent performance improvement on a variety of experimental settings.

Submitted 16 August, 2022; v1 submitted 7 August, 2022; originally announced August 2022.

Comments: Accepted at ECCV 2022; Project Page at http://cscct.github.io/

arXiv:2208.02284 [pdf, other]

Conceptual Design of the Modular Detector and Readout System for the CMB-S4 survey experiment

Authors: D. R. Barron, Z. Ahmed, J. Aguilar, A. J. Anderson, C. F. Baker, P. S. Barry, J. A. Beall, A. N. Bender, B. A. Benson, R. W. Besuner, T. W. Cecil, C. L. Chang, S. C. Chapman, G. E. Chesmore, G. Derylo, W. B. Doriese, S. M. Duff, T. Elleflot, J. P. Filippini, B. Flaugher, J. G. Gomez, P. K. Grimes, R. Gualtieri, I. Gullett, G. Haller, et al. (25 additional authors not shown)

Abstract: We present the conceptual design of the modular detector and readout system for the Cosmic Microwave Background Stage 4 (CMB-S4) ground-based survey experiment. CMB-S4 will map the cosmic microwave background (CMB) and the millimeter-wave sky to unprecedented sensitivity, using 500,000 superconducting detectors observing from Chile and Antarctica to map over 60 percent of the sky. The fundamental building block of the detector and readout system is a detector module package operated at 100 mK, which is connected to a readout and amplification chain that carries signals out to room temperature. It uses arrays of feedhorn-coupled orthomode transducers (OMT) that collect optical power from the sky onto dc-voltage-biased transition-edge sensor (TES) bolometers. The resulting current signal in the TESs is then amplified by a two-stage cryogenic Superconducting Quantum Interference Device (SQUID) system with a time-division multiplexer to reduce wire count, and matching room-temperature electronics to condition and transmit signals to the data acquisition system. Sensitivity and systematics requirements are being developed for the detector and readout system over a wide range of observing bands (20 to 300 GHz) and optical powers to accomplish CMB-S4's science goals. While the design incorporates the successes of previous generations of CMB instruments, CMB-S4 requires an order of magnitude more detectors than any prior experiment.
This requires fabrication of complex superconducting circuits on over 10 square meters of silicon, as well as significant amounts of precision wiring, assembly and cryogenic testing.

Submitted 3 August, 2022; originally announced August 2022.

Comments: 25 pages, 15 figures, presented at and published in the proceedings of SPIE Astronomical Telescopes and Instrumentation 2022

arXiv:2208.00777 [pdf, other]

D3Former: Debiased Dual Distilled Transformer for Incremental Learning

Authors: Abdelrahman Mohamed, Rushali Grandhe, K J Joseph, Salman Khan, Fahad Khan

Abstract: In class incremental learning (CIL) setting, groups of classes are introduced to a model in each learning phase. The goal is to learn a unified model performant on all the classes observed so far. Given the recent popularity of Vision Transformers (ViTs) in conventional classification settings, an interesting question is to study their continual learning behaviour. In this work, we develop a Debiased Dual Distilled Transformer for CIL dubbed $\textrm{D}^3\textrm{Former}$. The proposed model leverages a hybrid nested ViT design to ensure data efficiency and scalability to small as well as large datasets. In contrast to a recent ViT based CIL approach, our $\textrm{D}^3\textrm{Former}$ does not dynamically expand its architecture when new tasks are learned and remains suitable for a large number of incremental tasks. The improved CIL behaviour of $\textrm{D}^3\textrm{Former}$ owes to two fundamental changes to the ViT design. First, we treat the incremental learning as a long-tail classification problem where the majority samples from new classes vastly outnumber the limited exemplars available for old classes. To avoid the bias against the minority old classes, we propose to dynamically adjust logits to emphasize on retaining the representations relevant to old tasks. Second, we propose to preserve the configuration of spatial attention maps as the learning progresses across tasks. This helps in reducing catastrophic forgetting by constraining the model to retain the attention on the most discriminative regions.
$\textrm{D}^3\textrm{Former}$ obtains favorable results on incremental versions of CIFAR-100, MNIST, SVHN, and ImageNet datasets. Code is available at https://tinyurl.com/d3former

Submitted 3 June, 2023; v1 submitted 25 July, 2022; originally announced August 2022.

Comments: Accepted to CLVision at CVPR 2023

arXiv:2207.11886 [pdf, other]

Deep learning based non-contact physiological monitoring in Neonatal Intensive Care Unit

Authors: Nicky Nirlipta Sahoo, Balamurali Murugesan, Ayantika Das, Srinivasa Karthik, Keerthi Ram, Steffen Leonhardt, Jayaraj Joseph, Mohanasankar Sivaprakasam

Abstract: Preterm babies in the Neonatal Intensive Care Unit (NICU) have to undergo continuous monitoring of their cardiac health. Conventional monitoring approaches are contact-based, making the neonates prone to various nosocomial infections. Video-based monitoring approaches have opened up potential avenues for contactless measurement. This work presents a pipeline for remote estimation of cardiopulmonary signals from videos in NICU setup. We have proposed an end-to-end deep learning (DL) model that integrates a non-learning based approach to generate surrogate ground truth (SGT) labels for supervision, thus refraining from direct dependency on true ground truth labels. We have performed an extended qualitative and quantitative analysis to examine the efficacy of our proposed DL-based pipeline and achieved an overall average mean absolute error of 4.6 beats per minute (bpm) and root mean square error of 6.2 bpm in the estimated heart rate.
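For readers unfamiliar with the two error metrics reported in this abstract, the sketch below shows how mean absolute error (MAE) and root mean square error (RMSE) are conventionally computed from paired heart-rate estimates and reference values. The sample values are made up for illustration and are not taken from the paper; the function names are hypothetical.

```python
import math
from typing import Sequence


def mean_absolute_error(estimated: Sequence[float], reference: Sequence[float]) -> float:
    """Average of |estimate - reference| over all paired samples."""
    if len(estimated) != len(reference) or not estimated:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(e - r) for e, r in zip(estimated, reference)) / len(estimated)


def root_mean_square_error(estimated: Sequence[float], reference: Sequence[float]) -> float:
    """Square root of the average squared difference; penalizes large errors more."""
    if len(estimated) != len(reference) or not estimated:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_sq = sum((e - r) ** 2 for e, r in zip(estimated, reference)) / len(estimated)
    return math.sqrt(mean_sq)


# Illustrative heart-rate values in beats per minute (not data from the paper).
estimated_bpm = [100.0, 110.0]
reference_bpm = [104.0, 110.0]

mae = mean_absolute_error(estimated_bpm, reference_bpm)
rmse = root_mean_square_error(estimated_bpm, reference_bpm)
print(f"MAE = {mae:.2f} bpm, RMSE = {rmse:.2f} bpm")
```

RMSE is never smaller than MAE on the same data, so an RMSE noticeably above the MAE, as in the reported 6.2 bpm versus 4.6 bpm, indicates that some individual errors are considerably larger than the average error.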

Submitted 24 July, 2022; originally announced July 2022.

arXiv:2207.11036 [pdf, other]

NISTT: A Non-Intrusive SystemC-TLM 2.0 Tracing Tool

Authors: Nils Bosbach, Lukas Jünger, Jan Moritz Joseph, Rainer Leupers

Abstract: The increasing complexity of systems-on-a-chip requires the continuous development of electronic design automation tools. Nowadays, the simulation of systems-on-a-chip using virtual platforms is common. Virtual platforms enable hardware/software co-design to shorten the time to market, offer insights into the models, and allow debugging of the simulated hardware. Profiling tools are required to improve the usability of virtual platforms. During simulation, these tools capture data that are evaluated afterward. Those data can reveal information about the simulation itself and the software executed on the platform. This work presents the tracing tool NISTT that can profile SystemC-TLM-2.0-based virtual platforms. NISTT is implemented in a completely non-intrusive way. That means no changes in the simulation are needed, the source code of the simulation is not required, and the traced simulation does not need to contain debug symbols. The standardized SystemC application programming interface guarantees the compatibility of NISTT with other simulations. The strengths of NISTT are demonstrated in a case study. Here, NISTT is connected to a virtual platform and traces the boot process of Linux. After the simulation, the database created by NISTT is evaluated, and the results are visualized. Furthermore, the overhead of NISTT is quantified. It is shown that NISTT has only a minor influence on the overall simulation performance.

Submitted 22 July, 2022; originally announced July 2022.

Comments: PREPRINT - accepted by 30th IFIP/IEEE International Conference on Very Large Scale Integration 2022 (VLSI-SoC 2022)

arXiv:2207.10659 [pdf, other]

Novel Class Discovery without Forgetting

Authors: K J Joseph, Sujoy Paul, Gaurav Aggarwal, Soma Biswas, Piyush Rai, Kai Han, Vineeth N Balasubramanian

Abstract: Humans possess an innate ability to identify and differentiate instances that they are not familiar with, by leveraging and adapting the knowledge that they have acquired so far. Importantly, they achieve this without deteriorating the performance on their earlier learning. Inspired by this, we identify and formulate a new, pragmatic problem setting of NCDwF: Novel Class Discovery without Forgetting, which tasks a machine learning model to incrementally discover novel categories of instances from unlabeled data, while maintaining its performance on the previously seen categories. We propose 1) a method to generate pseudo-latent representations which act as a proxy for (no longer available) labeled data, thereby alleviating forgetting, 2) a mutual-information based regularizer which enhances unsupervised discovery of novel classes, and 3) a simple Known Class Identifier which aids generalized inference when the testing data contains instances from both seen and unseen categories. We introduce experimental protocols based on CIFAR-10, CIFAR-100 and ImageNet-1000 to measure the trade-off between knowledge retention and novel class discovery. Our extensive evaluations reveal that existing models catastrophically forget previously seen categories while identifying novel categories, while our method is able to effectively balance between the competing objectives. We hope our work will attract further research into this newly identified pragmatic problem setting.

Submitted 21 July, 2022; originally announced July 2022.

Comments: Accepted to ECCV 2022

arXiv:2206.11613 [pdf, other]

EmuNoC: Hybrid Emulation for Fast and Flexible Network-on-Chip Prototyping on FPGAs

Authors: Yee Yang Tan, Felix Staudigl, Lukas Jünger, Anna Drewes, Rainer Leupers, Jan Moritz Joseph

Abstract: Networks-on-Chips (NoCs) recently became widely used, from multi-core CPUs to edge-AI accelerators. Emulation on FPGAs promises to accelerate their RTL modeling compared to slow simulations. However, realistic test stimuli are challenging to generate in hardware for diverse applications. In other words, both a fast and flexible design framework is required. The most promising solution is hybrid emulation, in which parts of the design are simulated in software, and the other parts are emulated in hardware. This paper proposes a novel hybrid emulation framework called EmuNoC. We introduce a clock-synchronization method and software-only packet generation that improves the emulation speed by 36.3x to 79.3x over state-of-the-art frameworks while retaining the flexibility of a pure-software interface for stimuli simulation. We also increased the area efficiency to model up to an NoC with 169 routers on a single FPGA, while previous frameworks only achieved 64 routers.

Submitted 23 June, 2022; originally announced June 2022.

arXiv:2206.10183 [pdf]

covEcho Resource constrained lung ultrasound image analysis tool for faster triaging and active learning

Authors: **u Joseph, Mahesh Raveendranatha Panicker, Yale Tung Chen, Kesavadas Chandrasekharan, Vimal Chacko Mondy, Anoop Ayyappan, **eesh Valakkada, Kiran Vishnu Narayan

Abstract: Lung ultrasound (LUS) is possibly the only medical imaging modality which could be used for continuous and periodic monitoring of the lung. This is extremely useful in tracking the lung manifestations either during the onset of lung infection or to track the effect of vaccination on lung as in pandemics such as COVID-19. There have been many attempts in automating the classification of severity of lung into various classes or automatic segmentation of various LUS landmarks and manifestations. However, all these approaches are based on training static machine learning models which require a significantly clinically annotated large dataset and are computationally heavy and most of the time non-real time. In this work, a real-time light weight active learning-based approach is presented for faster triaging in COVID-19 subjects in resource constrained settings. The tool, based on the You Only Look Once (YOLO) network, has the capability of providing the quality of images based on the identification of various LUS landmarks, artefacts and manifestations, prediction of severity of lung infection, possibility of active learning based on the feedback from clinicians or on the image quality and a summarization of the significant frames which are having high severity of infection and high image quality for further analysis. The results show that the proposed tool has a mean average precision (mAP) of 66% at an Intersection over Union (IoU) threshold of 0.5 for the prediction of LUS landmarks.
The 14MB lightweight YOLOv5s network achieves 123 FPS while running on a Quadro P4000 GPU. The tool is available for usage and analysis upon request from the authors.
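The abstract reports mAP at an IoU threshold of 0.5. As a minimal sketch of what that threshold means, the code below computes Intersection over Union for two axis-aligned boxes and checks whether a prediction would count as a match. The `(x1, y1, x2, y2)` box layout and the function names are assumptions made for illustration and are not taken from the covEcho tool.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # assumed layout: (x1, y1, x2, y2), x1 < x2, y1 < y2


def box_area(box: Box) -> float:
    """Area of an axis-aligned box; degenerate boxes have zero area."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes, in [0, 1]."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = inter_w * inter_h
    union = box_area(a) + box_area(b) - intersection
    return intersection / union if union > 0 else 0.0


def is_match(predicted: Box, ground_truth: Box, threshold: float = 0.5) -> bool:
    """A prediction counts as a true positive if its IoU reaches the threshold."""
    return iou(predicted, ground_truth) >= threshold


# Illustrative boxes (not data from the paper).
ground_truth = (0.0, 0.0, 2.0, 2.0)
shifted = (1.0, 0.0, 3.0, 2.0)     # overlaps half of the ground-truth box
print(iou(ground_truth, shifted), is_match(shifted, ground_truth))
```

Average precision is then computed per class from the ranked list of matched and unmatched detections, and mAP is the mean over classes; that part is omitted here to keep the sketch short.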

Submitted 21 June, 2022; originally announced June 2022.

Comments: Submitted to Elsevier CMPBUP on Dec 1, 2021

arXiv:2204.10595 [pdf, other]

Spacing Loss for Discovering Novel Categories

Authors: K J Joseph, Sujoy Paul, Gaurav Aggarwal, Soma Biswas, Piyush Rai, Kai Han, Vineeth N Balasubramanian

Abstract: Novel Class Discovery (NCD) is a learning paradigm, where a machine learning model is tasked to semantically group instances from unlabeled data, by utilizing labeled instances from a disjoint set of classes. In this work, we first characterize existing NCD approaches into single-stage and two-stage methods based on whether they require access to labeled and unlabeled data together while discovering new classes. Next, we devise a simple yet powerful loss function that enforces separability in the latent space using cues from multi-dimensional scaling, which we refer to as Spacing Loss. Our proposed formulation can either operate as a standalone method or can be plugged into existing methods to enhance them. We validate the efficacy of Spacing Loss with thorough experimental evaluation across multiple settings on CIFAR-10 and CIFAR-100 datasets.

Submitted 22 April, 2022; originally announced April 2022.

Comments: Accepted to Continual Learning in Computer Vision Workshop (CLVision) at CVPR 2022

arXiv:2204.01501 [pdf, other]

X-Fault: Impact of Faults on Binary Neural Networks in Memristor-Crossbar Arrays with Logic-in-Memory Computation

Authors: Felix Staudigl, Karl J. X. Sturm, Maximilian Bartel, Thorben Fetz, Dominik Sisejkovic, Jan Moritz Joseph, Leticia Bolzani Pöhls, Rainer Leupers

Abstract: Memristor-based crossbar arrays represent a promising emerging memory technology to replace conventional memories by offering a high density and enabling computing-in-memory (CIM) paradigms. While analog computing provides the best performance, non-idealities and ADC/DAC conversion limit memristor-based CIM. Logic-in-Memory (LIM) presents another flavor of CIM, in which the memristors are used in a binary manner to implement logic gates. Since binary neural networks (BNNs) use binary logic gates as the dominant operation, they can benefit from the massively parallel execution of binary operations and better resilience to variations of the memristors. Although conventional neural networks have been thoroughly investigated, the impact of faults on memristor-based BNNs remains unclear. Therefore, we analyze the impact of faults on logic gates in memristor-based crossbar arrays for BNNs. We propose a simulation framework that simulates different traditional faults to examine the accuracy loss of BNNs on memristive crossbar arrays. In addition, we compare different logic families based on the robustness and feasibility to accelerate AI applications.

Submitted 4 April, 2022; originally announced April 2022.

arXiv:2203.14952 [pdf, other]

Energy-based Latent Aligner for Incremental Learning

Authors: K J Joseph, Salman Khan, Fahad Shahbaz Khan, Rao Muhammad Anwer, Vineeth N Balasubramanian

Abstract: Deep learning models tend to forget their earlier knowledge while incrementally learning new tasks. This behavior emerges because the parameter updates optimized for the new tasks may not align well with the updates suitable for older tasks. The resulting latent representation mismatch causes forgetting. In this work, we propose ELI: Energy-based Latent Aligner for Incremental Learning, which first learns an energy manifold for the latent representations such that previous task latents will have low energy and the current task latents have high energy values. This learned manifold is used to counter the representational shift that happens during incremental learning. The implicit regularization that is offered by our proposed methodology can be used as a plug-and-play module in existing incremental learning methodologies. We validate this through extensive evaluation on CIFAR-100, ImageNet subset, ImageNet 1k and Pascal VOC datasets. We observe consistent improvement when ELI is added to three prominent methodologies in class-incremental learning, across multiple incremental settings. Further, when added to the state-of-the-art incremental object detector, ELI provides over 5% improvement in detection accuracy, corroborating its effectiveness and complementary advantage to existing art.

Submitted 28 March, 2022; originally announced March 2022.

Comments: To appear in CVPR 2022. Code is available in https://github.com/JosephKJ/ELI

arXiv:2203.08024 [pdf, other]

Snowmass 2021 CMB-S4 White Paper

Authors: Kevork Abazajian, Arwa Abdulghafour, Graeme E. Addison, Peter Adshead, Zeeshan Ahmed, Marco Ajello, Daniel Akerib, Steven W. Allen, David Alonso, Marcelo Alvarez, Mustafa A. Amin, Mandana Amiri, Adam Anderson, Behzad Ansarinejad, Melanie Archipley, Kam S. Arnold, Matt Ashby, Han Aung, Carlo Baccigalupi, Carina Baker, Abhishek Bakshi, Debbie Bard, Denis Barkats, Darcy Barron, Peter S. Barry , et al. (331 additional authors not shown)

Abstract: This Snowmass 2021 White Paper describes the Cosmic Microwave Background Stage 4 project CMB-S4, which is designed to cross critical thresholds in our understanding of the origin and evolution of the Universe, from the highest energies at the dawn of time through the growth of structure to the present day. We provide an overview of the science case, the technical design, and project plan.

Submitted 15 March, 2022; originally announced March 2022.

Comments: Contribution to Snowmass 2021. arXiv admin note: substantial text overlap with arXiv:1908.01062, arXiv:1907.04473

arXiv:2201.03954 [pdf, other]

The Dataset Nutrition Label (2nd Gen): Leveraging Context to Mitigate Harms in Artificial Intelligence

Authors: Kasia S. Chmielinski, Sarah Newman, Matt Taylor, Josh Joseph, Kemi Thomas, Jessica Yurkofsky, Yue Chelsea Qiu

Abstract: As the production of and reliance on datasets to produce automated decision-making systems (ADS) increases, so does the need for processes for evaluating and interrogating the underlying data. After launching the Dataset Nutrition Label in 2018, the Data Nutrition Project has made significant updates to the design and purpose of the Label, and is launching an updated Label in late 2020, which is previewed in this paper. The new Label includes context-specific Use Cases & Alerts presented through an updated design and user interface targeted towards the data scientist profile. This paper discusses the harm and bias from underlying training data that the Label is intended to mitigate, the current state of the work including new datasets being labeled, new and existing challenges, and further directions of the work, as well as Figures previewing the new label.

Submitted 10 March, 2022; v1 submitted 10 January, 2022; originally announced January 2022.

arXiv:2112.11557 [pdf, other]

doi 10.2140/pjm.2022.319.343

Bridge trisections and classical knotted surface theory

Authors: Jason Joseph, Jeffrey Meier, Maggie Miller, Alexander Zupan

Abstract: We seek to connect ideas in the theory of bridge trisections with other well-studied facets of classical knotted surface theory. First, we show how the normal Euler number can be computed from a tri-plane diagram, and we use this to give a trisection-theoretic proof of the Whitney-Massey Theorem, which bounds the possible values of this number in terms of the Euler characteristic. Second, we describe in detail how to compute the fundamental group and related invariants from a tri-plane diagram, and we use this, together with an analysis of bridge trisections of ribbon surfaces, to produce an infinite family of knotted spheres that admit non-isotopic bridge trisections of minimal complexity.

Submitted 18 October, 2022; v1 submitted 21 December, 2021; originally announced December 2021.

Comments: v1 has been divided into two papers: the present article and "Bridge trisections and Seifert solids," which will be posted simultaneously; 29 pages, 11 figures

Journal ref: Pacific J. Math. 319 (2022) 343-369

arXiv:2112.02425 [pdf, other]

doi 10.1007/s10909-022-02796-8

Low Noise Frequency Domain Multiplexing of TES Bolometers using Sub-kelvin SQUIDs

Authors: Tucker Elleflot, Aritoki Suzuki, Kam Arnold, Chris Bebek, Robin H. Cantor, Kevin T. Crowley, John Groh, Tijmen de Haan, Amber Hornsby, John Joseph, Adrian T. Lee, Tiffany Liu, Joshua Montgomery, Megan Russell, Qingyang Yu

Abstract: Digital Frequency-Domain Multiplexing (DfMux) is a technique that uses MHz superconducting resonators and Superconducting Quantum Interference Device (SQUID) arrays to read out sets of Transition Edge Sensors. DfMux has been used by several Cosmic Microwave Background experiments, including most recently POLARBEAR-2 and SPT-3G with multiplexing factors as high as 68, and is the baseline readout technology for the planned satellite mission LiteBIRD. Here, we present recent work focused on improving DfMux readout noise, reducing parasitic impedance, and improving sensor operation. We have achieved a substantial reduction in stray impedance by integrating the sensors, resonators, and SQUID array onto a single carrier board operated at 250 mK. This also drastically simplifies the packaging of the cryogenic components and leads to better-controlled crosstalk. We demonstrate a low readout noise level of 8.6 pA/Hz$^{-1/2}$, which was made possible by operating the SQUID array at a reduced temperature and with a low dynamic impedance. This is a factor of two improvement compared to the achieved readout noise level in currently operating Cosmic Microwave Background experiments using DfMux and represents a critical step toward maturation of the technology for the next generation of instruments.

Submitted 4 December, 2021; originally announced December 2021.

arXiv:2112.01513 [pdf, other]

OW-DETR: Open-world Detection Transformer

Authors: Akshita Gupta, Sanath Narayan, K J Joseph, Salman Khan, Fahad Shahbaz Khan, Mubarak Shah

Abstract: Open-world object detection (OWOD) is a challenging computer vision problem, where the task is to detect a known set of object categories while simultaneously identifying unknown objects. Additionally, the model must incrementally learn new classes that become known in the next training episodes. Distinct from standard object detection, the OWOD setting poses significant challenges for generating quality candidate proposals on potentially unknown objects, separating the unknown objects from the background and detecting diverse unknown objects. Here, we introduce a novel end-to-end transformer-based framework, OW-DETR, for open-world object detection. The proposed OW-DETR comprises three dedicated components namely, attention-driven pseudo-labeling, novelty classification and objectness scoring to explicitly address the aforementioned OWOD challenges. Our OW-DETR explicitly encodes multi-scale contextual information, possesses less inductive bias, enables knowledge transfer from known classes to the unknown class and can better discriminate between unknown objects and background. Comprehensive experiments are performed on two benchmarks: MS-COCO and PASCAL VOC. The extensive ablations reveal the merits of our proposed contributions. Further, our model outperforms the recently introduced OWOD approach, ORE, with absolute gains ranging from 1.8% to 3.3% in terms of unknown recall on MS-COCO. In the case of incremental object detection, OW-DETR outperforms the state-of-the-art for all settings on PASCAL VOC. Our code is available at https://github.com/akshitac8/OW-DETR.

Submitted 4 April, 2022; v1 submitted 2 December, 2021; originally announced December 2021.

Comments: 16 pages, CVPR 2022 accepted

arXiv:2112.01087 [pdf, ps, other]

doi 10.23919/DATE54114.2022.9774651

NeuroHammer: Inducing Bit-Flips in Memristive Crossbar Memories

Authors: Felix Staudigl, Hazem Al Indari, Daniel Schön, Dominik Sisejkovic, Farhad Merchant, Jan Moritz Joseph, Vikas Rana, Stephan Menzel, Rainer Leupers

Abstract: Emerging non-volatile memory (NVM) technologies offer unique advantages in energy efficiency, latency, and features such as computing-in-memory. Consequently, emerging NVM technologies are considered an ideal substrate for computation and storage in future-generation neuromorphic platforms. These technologies need to be evaluated for fundamental reliability and security issues. In this paper, we present \emph{NeuroHammer}, a security threat in ReRAM crossbars caused by thermal crosstalk between memory cells. We demonstrate that bit-flips can be deliberately induced in ReRAM devices in a crossbar by systematically writing adjacent memory cells. A simulation flow is developed to evaluate NeuroHammer and the impact of physical parameters on the effectiveness of the attack. Finally, we discuss the security implications in the context of possible attack scenarios.

Submitted 6 December, 2021; v1 submitted 2 December, 2021; originally announced December 2021.

arXiv:2111.09233 [pdf, other]

Bridge numbers and meridional ranks of knotted surfaces and welded knots

Authors: Jason Joseph, Puttipong Pongtanapaisan

Abstract: The Meridional Rank Conjecture asks whether the bridge number of a knot in $S^3$ is equal to the minimal number of meridians needed to generate the fundamental group of its complement. In this paper we investigate the analogous conjecture for knotted surfaces in $S^4$. Towards this end, we give a construction to produce classical knots with quotients sending meridians to elements of any finite order and which detect their meridional ranks. We establish the equality of bridge number and meridional rank for these knots and knotted spheres obtained from them by twist-spinning. On the other hand, we show that the meridional rank of knotted spheres is not additive under connected sum, so that either bridge number also collapses, or meridional rank is not equal to bridge number for knotted spheres. We also show a relationship between the bridge numbers of welded knots and ribbon tori using the Tube map, and give applications to bridge trisections of knotted surfaces.

Submitted 6 February, 2023; v1 submitted 17 November, 2021; originally announced November 2021.

arXiv:2108.08295 [pdf, other]

AIRCHITECT: Learning Custom Architecture Design and Mapping Space

Authors: Ananda Samajdar, Jan Moritz Joseph, Matthew Denton, Tushar Krishna

Abstract: Design space exploration is an important but costly step involved in the design/deployment of custom architectures to squeeze out maximum possible performance and energy efficiency. Conventionally, optimizations require iterative sampling of the design space using simulation or heuristic tools. In this paper we investigate the possibility of learning the optimization task using machine learning and hence using the learnt model to predict optimal parameters for the design and mapping space of custom architectures, bypassing any exploration step. We use three case studies involving the optimal array design, SRAM buffer sizing, mapping, and schedule determination for systolic-array-based custom architecture design and mapping space. Within the purview of these case studies, we show that it is possible to capture the design space and train a model to "generalize" prediction of the optimal design and mapping parameters when queried with workload and design constraints. We perform systematic design-aware and statistical analysis of the optimization space for our case studies and highlight the patterns in the design space. We formulate the architecture design and mapping as a machine learning problem that allows us to leverage existing ML models for training and inference. We design and train a custom network architecture called AIRCHITECT, which is capable of learning the architecture design space with as high as 94.3% test accuracy and predicting optimal configurations which achieve on average (GeoMean) 99.9% of the best possible performance on a test dataset with $10^5$ GEMM workloads.

Submitted 16 August, 2021; originally announced August 2021.

arXiv:2107.09915 [pdf]

doi 10.1038/s41467-021-27360-y

Influence of Shape Resonances on the Angular Dependence of Molecular Photoionization Delays

Authors: Fabian Holzmeier, Jennifer Joseph, Jean-Christophe Houver, Mogens Lebech, Danielle Dowek, Robert R. Lucchese

Abstract: Characterizing time delays in molecular photoionization as a function of the ejected electron emission direction relative to the orientation of the molecule and the light polarization axis provides unprecedented insights into the attosecond dynamics induced by extreme ultraviolet or X-ray one-photon absorption, including the role of electronic correlation and continuum resonant states. Here, we report completely resolved experimental and computational angular dependence of single-photon ionization delays in NO molecules across a shape resonance, relying on synchrotron radiation and time independent ab initio calculations. The angle-dependent time delay variations of few hundreds of attoseconds, resulting from the interference of the resonant and non-resonant contributions to the dynamics of the ejected electron, are well described using a multichannel Fano model where the resonance time delay is angle-independent. Comparing these results with the same resonance computed in e-NO+ scattering highlights the connection of photoionization delays with Wigner scattering time delays.

Submitted 23 November, 2021; v1 submitted 21 July, 2021; originally announced July 2021.

Comments: 21 pages, 10 figures, supporting material included

arXiv:2103.07188 [pdf, other]

doi 10.3847/1538-4357/abfd33

Study of Temporal and Spectral variability for Blazar PKS 1830-211 with Multi-Wavelength Data

Authors: Jayant Abhir, Raj Prince, Jophin Joseph, Debanjan Bose, Nayantara Gupta

Abstract: A study of the gravitationally lensed blazar PKS 1830-211 was carried out using multi waveband data collected by Fermi-LAT, Swift-XRT and Swift-UVOT telescopes between MJD 58400 to MJD 58800 (9 Oct 2018 to 13 Nov 2019). Flaring states were identified by analysing the gamma-ray light curve. Simultaneous multi-waveband SED were obtained for those flaring periods. A cross-correlation analysis of the multi-waveband data was carried out, which suggested a common origin of the gamma-ray and X-ray emission. The broadband emission mechanism was studied by modelling the SED using a leptonic model. Physical parameters of the blazar were estimated from the broadband SED modelling. The blazar PKS 1830-211 is gravitationally lensed by at least two galaxies and has been extensively studied in the literature because of this property. The self-correlation of the gamma-ray light curve was studied to identify the signature of lensing, but no conclusive evidence of correlation was found at the expected time delay of 26 days.

Submitted 13 July, 2021; v1 submitted 12 March, 2021; originally announced March 2021.

Comments: 17 pages, 8 figures. Updated to match the published text based on referee's suggestions and minor corrections. Results remain the same

Journal ref: The Astrophysical Journal, Volume 915 Number 1 Page 26. Published 30 June 2021

arXiv:2103.02603 [pdf, other]

Towards Open World Object Detection

Authors: K J Joseph, Salman Khan, Fahad Shahbaz Khan, Vineeth N Balasubramanian

Abstract: Humans have a natural instinct to identify unknown object instances in their environments. The intrinsic curiosity about these unknown instances aids in learning about them, when the corresponding knowledge is eventually available. This motivates us to propose a novel computer vision problem called: `Open World Object Detection', where a model is tasked to: 1) identify objects that have not been introduced to it as `unknown', without explicit supervision to do so, and 2) incrementally learn these identified unknown categories without forgetting previously learned classes, when the corresponding labels are progressively received. We formulate the problem, introduce a strong evaluation protocol and provide a novel solution, which we call ORE: Open World Object Detector, based on contrastive clustering and energy based unknown identification. Our experimental evaluation and ablation studies analyze the efficacy of ORE in achieving Open World objectives. As an interesting by-product, we find that identifying and characterizing unknown instances helps to reduce confusion in an incremental object detection setting, where we achieve state-of-the-art performance, with no extra methodological effort. We hope that our work will attract further research into this newly identified, yet crucial research direction.

Submitted 9 May, 2021; v1 submitted 3 March, 2021; originally announced March 2021.

Comments: To appear in CVPR 2021 as an ORAL paper. Code is available in https://github.com/JosephKJ/OWOD

arXiv:2102.05824 [pdf, other]

Reproducibility Report: La-MAML: Look-ahead Meta Learning for Continual Learning

Authors: Joel Joseph, Alex Gu

Abstract: The Continual Learning (CL) problem involves performing well on a sequence of tasks under limited compute. Current algorithms in the domain are either slow, offline or sensitive to hyper-parameters. La-MAML, an optimization-based meta-learning algorithm claims to be better than other replay-based, prior-based and meta-learning based approaches. According to the MER paper [1], metrics to measure performance in the continual learning arena are Retained Accuracy (RA) and Backward Transfer-Interference (BTI). La-MAML claims to perform better in these values when compared to the SOTA in the domain. This is the main claim of the paper, which we shall be verifying in this report.

Submitted 20 May, 2021; v1 submitted 10 February, 2021; originally announced February 2021.

arXiv:2012.12563 [pdf, other]

Architecture, Dataflow and Physical Design Implications of 3D-ICs for DNN-Accelerators

Authors: Jan Moritz Joseph, Ananda Samajdar, Lingjun Zhu, Rainer Leupers, Sung-Kyu Lim, Thilo Pionteck, Tushar Krishna

Abstract: The everlasting demand for higher computing power for deep neural networks (DNNs) drives the development of parallel computing architectures. 3D integration, in which chips are integrated and connected vertically, can further increase performance because it introduces another level of spatial parallelism. Therefore, we analyze dataflows, performance, area, power and temperature of such 3D-DNN-accelerators. Monolithic and TSV-based stacked 3D-ICs are compared against 2D-ICs. We identify workload properties and architectural parameters for efficient 3D-ICs and achieve up to 9.14x speedup of 3D vs. 2D. We discuss area-performance trade-offs. We demonstrate applicability as the 3D-IC draws similar power as 2D-ICs and is not thermal limited.

Submitted 18 February, 2021; v1 submitted 23 December, 2020; originally announced December 2020.

arXiv:2011.00129 [pdf]

Data Acquisition and Signal Processing for the Gamma Ray Energy Tracking Array (GRETA)

Authors: Thorsten Stezelberger, John Joseph, Vamsi Vytla, Sergio Zimmermann

Abstract: The Gamma Ray Energy Tracking Array (GRETA) is a 4-π detector system, currently under development, capable of determining energy, timing and tracking of multiple gamma-ray interactions inside germanium crystals as demonstrated in the Gamma Ray Energy Tracking In-Beam Array (GRETINA). Charge sensitive amplifiers instrument the crystals and their outputs are converted using analog to digital converters for real-time digital processing. In this paper, we will present the design of the detector system and data acquisition. We will describe the real time components of the digital signal-processing path used to find the energy and timing of the gamma rays at low and high rates. We will describe the performance of the data acquisition system hardware and firmware and compare with the requirements.

Submitted 30 October, 2020; originally announced November 2020.

Comments: Conference Record Real Time Conference 2020

arXiv:2010.00352 [pdf, other]

Meta-Consolidation for Continual Learning

Authors: K J Joseph, Vineeth N Balasubramanian

Abstract: The ability to continuously learn and adapt itself to new tasks, without losing grasp of already acquired knowledge is a hallmark of biological learning systems, which current deep learning systems fall short of. In this work, we present a novel methodology for continual learning called MERLIN: Meta-Consolidation for Continual Learning. We assume that weights of a neural network $\boldsymbol ψ$, for solving task $\boldsymbol t$, come from a meta-distribution $p(\boldsymbol{ψ|t})$. This meta-distribution is learned and consolidated incrementally. We operate in the challenging online continual learning setting, where a data point is seen by the model only once. Our experiments with continual learning benchmarks of MNIST, CIFAR-10, CIFAR-100 and Mini-ImageNet datasets show consistent improvement over five baselines, including a recent state-of-the-art, corroborating the promise of MERLIN.

Submitted 1 October, 2020; originally announced October 2020.

Comments: Accepted to NeurIPS 2020

arXiv:2009.08220 [pdf, other]

doi 10.1093/mnras/staa3639

Multi Frequency Temporal and Spectral variability study of Blazar PKS 1424-418

Authors: Jayant Abhir, Jophin Joseph, Sonal R Patel, Debanjan Bose

Abstract: A study of blazar PKS 1424-418 was carried out using multi waveband data collected by Fermi-LAT, Swift-XRT, Swift-UVOT and SMARTS telescopes between MJD 56000 to MJD 56600 (14 Mar 2012 to 4 Nov 2013). Two flaring episodes were identified by analysing the gamma ray light curve. Simultaneous multi waveband Spectral Energy Distributions (SED) were obtained for those two flaring periods. A cross-correlation analysis of IR-Optical and $γ$-ray data suggested the origin of these emissions from the same region. We have set a lower limit for the Doppler factor using the highest energy photon observed from this source during the flaring periods, which should be $>$12. The broadband emission mechanism was studied by modelling the SED using leptonic emission mechanism.

Submitted 15 January, 2021; v1 submitted 17 September, 2020; originally announced September 2020.

Comments: Published in MNRAS, Volume 501, Issue 2, Feb 2021. Pages 2504-2511. Updated with suggestions and corrections pointed out by the Referee. Results remain unchanged. 12 pages, 6 figures

Journal ref: Monthly Notices of the Royal Astronomical Society, Volume 501, Issue 2, Feb 2021, Pages 2504-2511

arXiv:2009.06420 [pdf, other]

Completely Self-Supervised Crowd Counting via Distribution Matching

Authors: Deepak Babu Sam, Abhinav Agarwalla, Jimmy Joseph, Vishwanath A. Sindagi, R. Venkatesh Babu, Vishal M. Patel

Abstract: Dense crowd counting is a challenging task that demands millions of head annotations for training models. Though existing self-supervised approaches could learn good representations, they require some labeled data to map these features to the end task of density estimation. We mitigate this issue with the proposed paradigm of complete self-supervision, which does not need even a single labeled image. The only input required to train, apart from a large set of unlabeled crowd images, is the approximate upper limit of the crowd count for the given dataset. Our method dwells on the idea that natural crowds follow a power law distribution, which could be leveraged to yield error signals for backpropagation. A density regressor is first pretrained with self-supervision and then the distribution of predictions is matched to the prior by optimizing Sinkhorn distance between the two. Experiments show that this results in effective learning of crowd features and delivers significant counting performance. Furthermore, we establish the superiority of our method in less data setting as well. The code and models for our approach is available at https://github.com/val-iisc/css-ccnn.

Submitted 14 September, 2020; originally announced September 2020.

arXiv:2007.13244 [pdf, other]

doi 10.1112/topo.12209

Unknotting numbers of 2-spheres in the 4-sphere

Authors: Jason Joseph, Michael Klug, Benjamin Ruppik, Hannah Schwartz

Abstract: We compare two naturally arising notions of unknotting number for 2-spheres in the 4-sphere: namely, the minimal number of 1-handle stabilizations needed to obtain an unknotted surface, and the minimal number of Whitney moves required in a regular homotopy to the unknotted 2-sphere. We refer to these invariants as the stabilization number and the Casson-Whitney number of the sphere, respectively. Using both algebraic and geometric techniques, we show that the stabilization number is bounded above by one more than the Casson-Whitney number. We also provide explicit families of spheres for which these invariants are equal, as well as families for which they are distinct. Furthermore, we give additional bounds for both invariants, concrete examples of their non-additivity, and applications to classical unknotting number of 1-knots.

Submitted 21 September, 2021; v1 submitted 26 July, 2020; originally announced July 2020.

Comments: 29 pages, 22 figures; v2 is the final draft which has been accepted for publication in Journal of Topology; v2 includes improvements to the exposition, the numbering of the theorems in the introduction and in some of the subsequent sections has changed

Report number: MPIM-Bonn-2020 MSC Class: 57K45 (Primary) 57K10; 57K40; 57R42; 57R52 (Secondary)

Journal ref: Journal of Topology, 14.4 (2021) 1321-1350

Showing 1–50 of 160 results for author: Joseph, J