
Advancing Neuromorphic Computing: Mixed-Signal Design Techniques Leveraging Brain Code Units and Fundamental Code Units

Authors: Murat Isik, Sols Miziev, Wiktoria Pawlak, Newton Howard

Abstract: This paper introduces a groundbreaking digital neuromorphic architecture that innovatively integrates Brain Code Unit (BCU) and Fundamental Code Unit (FCU) using mixed-signal design methodologies. Leveraging open-source datasets and the latest advances in materials science, our research focuses on enhancing the computational efficiency, accuracy, and adaptability of neuromorphic systems. The core of our approach lies in harmonizing the precision and scalability of digital systems with the robustness and energy efficiency of analog processing. Through experimentation, we demonstrate the effectiveness of our system across various metrics. The BCU achieved an accuracy of 88.0% and a power efficiency of 20.0 GOP/s/W, while the FCU recorded an accuracy of 86.5% and a power efficiency of 18.5 GOP/s/W. Our mixed-signal design approach significantly improved latency and throughput, achieving a latency as low as 0.75 ms and throughput up to 213 TOP/s. These results firmly establish the potential of our architecture in neuromorphic computing, providing a solid foundation for future developments in this domain. Our study underscores the feasibility of mixed-signal neuromorphic systems and their promise in advancing the field, particularly in applications requiring high efficiency and adaptability.

Submitted 18 March, 2024; originally announced March 2024.

Comments: Accepted at 2024 International Joint Conference on Neural Networks

arXiv:2401.12055 [pdf, other]

NEUROSEC: FPGA-Based Neuromorphic Audio Security

Authors: Murat Isik, Hiruna Vishwamith, Yusuf Sur, Kayode Inadagbo, I. Can Dikmen

Abstract: Neuromorphic systems, inspired by the complexity and functionality of the human brain, have attracted academic and industrial attention due to their unparalleled potential across a wide range of applications. While their capabilities herald innovation, it is imperative to underscore that these computational paradigms, analogous to their traditional counterparts, are not impervious to security threats. Although the exploration of neuromorphic methodologies for image and video processing has been rigorously pursued, the realm of neuromorphic audio processing remains in its early stages. Our results highlight the robustness and precision of our FPGA-based neuromorphic system. Specifically, our system showcases a commendable balance between desired signal and background noise, efficient spike rate encoding, and unparalleled resilience against adversarial attacks such as FGSM and PGD. A standout feature of our framework is its detection rate of 94%, which, when compared to other methodologies, underscores its greater capability in identifying and mitigating threats within 5.39 dB, a commendable SNR. Furthermore, neuromorphic computing and hardware security serve many sensor domains in mission-critical and privacy-preserving applications.

Submitted 22 January, 2024; originally announced January 2024.

Comments: Audio processing, FPGA, Hardware Security, Neuromorphic Computing

arXiv:2311.15006 [pdf, other]

A Survey Examining Neuromorphic Architecture in Space and Challenges from Radiation

Authors: Jonathan Naoukin, Murat Isik, Karn Tiwari

Abstract: Inspired by the human brain's structure and function, neuromorphic computing has emerged as a promising approach for developing energy-efficient and powerful computing systems. Neuromorphic computing offers significant processing speed and power consumption advantages in aerospace applications. These two factors are crucial for real-time data analysis and decision-making. However, the harsh space environment, particularly with the presence of radiation, poses significant challenges to the reliability and performance of these computing systems. This paper comprehensively surveys the integration of radiation-resistant neuromorphic computing systems in aerospace applications. We explore the challenges posed by space radiation, review existing solutions and developments, present case studies of neuromorphic computing systems used in space applications, discuss future directions, and discuss the potential benefits of this technology in future space missions.

Submitted 25 November, 2023; originally announced November 2023.

Comments: Submitted to IEEE Journal on Miniaturization for Air and Space Systems

arXiv:2311.12449 [pdf, other]

HPCNeuroNet: Advancing Neuromorphic Audio Signal Processing with Transformer-Enhanced Spiking Neural Networks

Authors: Murat Isik, Hiruna Vishwamith, Kayode Inadagbo, I. Can Dikmen

Abstract: This paper presents a novel approach to neuromorphic audio processing by integrating the strengths of Spiking Neural Networks (SNNs), Transformers, and high-performance computing (HPC) into the HPCNeuroNet architecture. Utilizing the Intel N-DNS dataset, we demonstrate the system's capability to process diverse human vocal recordings across multiple languages and noise backgrounds. The core of our approach lies in the fusion of the temporal dynamics of SNNs with the attention mechanisms of Transformers, enabling the model to capture intricate audio patterns and relationships. Our architecture, HPCNeuroNet, employs the Short-Time Fourier Transform (STFT) for time-frequency representation, Transformer embeddings for dense vector generation, and SNN encoding/decoding mechanisms for spike train conversions. The system's performance is further enhanced by leveraging the computational capabilities of NVIDIA's GeForce RTX 3060 GPU and Intel's Core i9 12900H CPU. Additionally, we introduce a hardware implementation on the Xilinx VU37P HBM FPGA platform, optimizing for energy efficiency and real-time processing. The proposed accelerator achieves a throughput of 71.11 Giga-Operations Per Second (GOP/s) with a 3.55 W on-chip power consumption at 100 MHz. The comparison results with off-the-shelf devices and recent state-of-the-art implementations illustrate that the proposed accelerator has obvious advantages in terms of energy efficiency and design flexibility. Through design-space exploration, we provide insights into optimizing core capacities for audio tasks. Our findings underscore the transformative potential of integrating SNNs, Transformers, and HPC for neuromorphic audio processing, setting a new benchmark for future research and applications.

Submitted 21 November, 2023; originally announced November 2023.

Comments: Submitted to IEEE Transactions on Signal Processing

arXiv:2311.12439 [pdf, other]

Harnessing FPGA Technology for Enhanced Biomedical Computation

Authors: Nisanur Alici, Kayode Inadagbo, Murat Isik

Abstract: This research delves into sophisticated neural network frameworks like Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), Long Short-Term Memory Networks (LSTMs), and Deep Belief Networks (DBNs) for improved analysis of ECG signals via Field Programmable Gate Arrays (FPGAs). The MIT-BIH Arrhythmia Database serves as the foundation for training and evaluating our models, with added Gaussian noise to heighten the algorithms' resilience. The developed architectures incorporate various layers for specific processing and categorization functions, employing strategies such as the EarlyStopping callback and Dropout layer to prevent overfitting. Additionally, this paper details the creation of a tailored Tensor Compute Unit (TCU) accelerator for the PYNQ Z1 platform. It provides a thorough methodology for implementing FPGA-based machine learning, encompassing the configuration of the Tensil toolchain in Docker, selection of architectures, PS-PL configuration, and the compilation and deployment of models. By evaluating performance indicators like latency and throughput, we showcase the efficacy of FPGAs in advanced biomedical computing. This study ultimately serves as a comprehensive guide to optimizing neural network operations on FPGAs across various fields.

Submitted 21 November, 2023; originally announced November 2023.

Comments: Submitted to IEEE Transactions on Biomedical Circuits and Systems. arXiv admin note: substantial text overlap with arXiv:2307.07914

arXiv:2310.09671 [pdf, other]

FPGA Implementation of OTFS Modulation for 6G Communication Systems

Authors: Murat Isik, Malvin Nkomo, Anup Das, Kapil R. Dandekar

Abstract: Sixth-generation (6G) communication systems are poised to accommodate high data-rate wireless communication services in highly dynamic channels, with applications including high-speed trains, unmanned aerial vehicles, and intelligent transportation systems. Orthogonal frequency-division multiplexing (OFDM) modulation suffers from performance degradation in such high-mobility applications due to high Doppler spread in the channel. The recently proposed Orthogonal Time Frequency Space (OTFS) modulation scheme outperforms OFDM in terms of supporting a higher transmitter (Tx) and receiver (Rx) user velocity. Additionally, the highly dynamic time-frequency (TF) channel has little effect on OTFS modulated signals, which enables the realization of low-complexity pre-processing architectures for implementing massive multiple-input multiple-output (MIMO) based OTFS systems. However, while OTFS has received attention in the literature from a theory and simulation perspective, there has been comparatively little work on real-time FPGA implementation of OTFS waveforms. Thus, in this paper, we first present a mathematical overview of OTFS modulation and then describe an FPGA implementation of OTFS on hardware. Power, area, and timing analysis of the implemented design on a Zynq UltraScale+ RFSoC FPGA are provided for benchmarking purposes.

Submitted 14 October, 2023; originally announced October 2023.

Comments: Accepted at IEEE Future Networks, 2023

arXiv:2309.08232 [pdf, other]

Astrocyte-Integrated Dynamic Function Exchange in Spiking Neural Networks

Authors: Murat Isik, Kayode Inadagbo

Abstract: This paper presents an innovative methodology for improving the robustness and computational efficiency of Spiking Neural Networks (SNNs), a critical component in neuromorphic computing. The proposed approach integrates astrocytes, a type of glial cell prevalent in the human brain, into SNNs, creating astrocyte-augmented networks. To achieve this, we designed and implemented an astrocyte model in two distinct platforms: CPU/GPU and FPGA. Our FPGA implementation notably utilizes Dynamic Function Exchange (DFX) technology, enabling real-time hardware reconfiguration and adaptive model creation based on current operating conditions. The novel approach of leveraging astrocytes significantly improves the fault tolerance of SNNs, thereby enhancing their robustness. Notably, our astrocyte-augmented SNN displays near-zero latency and theoretically infinite throughput, implying exceptional computational efficiency. Through comprehensive comparative analysis with prior works, it's established that our model surpasses others in terms of neuron and synapse count while maintaining an efficient power consumption profile. These results underscore the potential of our methodology in shaping the future of neuromorphic computing, by providing robust and energy-efficient systems.

Submitted 15 September, 2023; originally announced September 2023.

Comments: Accepted at 8th International Conference on Engineering of Computer-based Systems

arXiv:2307.07914 [pdf, other]

Exploiting FPGA Capabilities for Accelerated Biomedical Computing

Authors: Kayode Inadagbo, Baran Arig, Nisanur Alici, Murat Isik

Abstract: This study presents advanced neural network architectures including Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), Long Short-Term Memory Networks (LSTMs), and Deep Belief Networks (DBNs) for enhanced ECG signal analysis using Field Programmable Gate Arrays (FPGAs). We utilize the MIT-BIH Arrhythmia Database for training and validation, introducing Gaussian noise to improve algorithm robustness. The implemented models feature various layers for distinct processing and classification tasks, and techniques like the EarlyStopping callback and Dropout layer are used to mitigate overfitting. Our work also explores the development of a custom Tensor Compute Unit (TCU) accelerator for the PYNQ Z1 board, offering comprehensive steps for FPGA-based machine learning, including setting up the Tensil toolchain in Docker, selecting architecture, configuring PS-PL, and compiling and executing models. Performance metrics such as latency and throughput are calculated for practical insights, demonstrating the potential of FPGAs in high-performance biomedical computing. The study ultimately offers a guide for optimizing neural network performance on FPGAs for various applications.

Submitted 15 July, 2023; originally announced July 2023.

Comments: Accepted at IEEE Signal Processing Algorithms, Architectures, Arrangements and Applications (SPA), 2023

arXiv:2307.03910 [pdf, other]

A Survey of Spiking Neural Network Accelerator on FPGA

Authors: Murat Isik

Abstract: Due to the ability to implement customized topology, FPGA is increasingly used to deploy SNNs in both embedded and high-performance applications. In this paper, we survey state-of-the-art SNN implementations and their applications on FPGA. We collect the recent widely-used spiking neuron models, network structures, and signal encoding formats, followed by the enumeration of related hardware design schemes for FPGA-based SNN implementations. Compared with the previous surveys, this manuscript enumerates the application instances that applied the above-mentioned technical schemes in recent research. Based on that, we discuss the actual acceleration potential of implementing SNN on FPGA. Following this discussion, we outline the upcoming trends and give a guideline for further advancement in related subjects.

Submitted 8 July, 2023; originally announced July 2023.

arXiv:2305.06356 [pdf, other]

doi 10.1145/3592415

HumanRF: High-Fidelity Neural Radiance Fields for Humans in Motion

Authors: Mustafa Işık, Martin Rünz, Markos Georgopoulos, Taras Khakhulin, Jonathan Starck, Lourdes Agapito, Matthias Nießner

Abstract: Representing human performance at high-fidelity is an essential building block in diverse applications, such as film production, computer games or videoconferencing. To close the gap to production-level quality, we introduce HumanRF, a 4D dynamic neural scene representation that captures full-body appearance in motion from multi-view video input, and enables playback from novel, unseen viewpoints. Our novel representation acts as a dynamic video encoding that captures fine details at high compression rates by factorizing space-time into a temporal matrix-vector decomposition. This allows us to obtain temporally coherent reconstructions of human actors for long sequences, while representing high-resolution details even in the context of challenging motion. While most research focuses on synthesizing at resolutions of 4MP or lower, we address the challenge of operating at 12MP. To this end, we introduce ActorsHQ, a novel multi-view dataset that provides 12MP footage from 160 cameras for 16 sequences with high-fidelity, per-frame mesh reconstructions. We demonstrate challenges that emerge from using such high-resolution data and show that our newly introduced HumanRF effectively leverages this data, making a significant step towards production-level quality novel view synthesis.

Submitted 11 May, 2023; v1 submitted 10 May, 2023; originally announced May 2023.

Comments: Project webpage: https://synthesiaresearch.github.io/humanrf Dataset webpage: https://www.actors-hq.com/ Video: https://www.youtube.com/watch?v=OTnhiLLE7io Code: https://github.com/synthesiaresearch/humanrf

arXiv:2304.12474 [pdf, other]

Design optimization for high-performance computing using FPGA

Authors: Murat Isik, Kayode Inadagbo, Hakan Aktas

Abstract: Reconfigurable architectures like Field Programmable Gate Arrays (FPGAs) have been used for accelerating computations in several domains because of their unique combination of flexibility, performance, and power efficiency. However, FPGAs have not been widely used for high-performance computing, primarily because of their programming complexity and difficulties in optimizing performance. We optimize Tensil AI's open-source inference accelerator for maximum performance using ResNet20 trained on CIFAR in this paper in order to gain insight into the use of FPGAs for high-performance computing. In this paper, we show how improving hardware design, using Xilinx Ultra RAM, and using advanced compiler strategies can lead to improved inference performance. We also demonstrate that running the CIFAR test data set shows very little accuracy drop when rounding down from the original 32-bit floating point. The heterogeneous computing model in our platform allows us to achieve a frame rate of 293.58 frames per second (FPS) and a 90% accuracy on a ResNet20 trained using CIFAR. The experimental results show that the proposed accelerator achieves a throughput of 21.12 Giga-Operations Per Second (GOP/s) with a 5.21 W on-chip power consumption at 100 MHz. The comparison results with off-the-shelf devices and recent state-of-the-art implementations illustrate that the proposed accelerator has obvious advantages in terms of energy efficiency.

Submitted 24 April, 2023; originally announced April 2023.

arXiv:2301.07050 [pdf, other]

An Energy-Efficient Reconfigurable Autoencoder Implementation on FPGA

Authors: Murat Isik, Matthew Oldland, Lifeng Zhou

Abstract: Autoencoders are unsupervised neural networks that are used to process and compress input data and then reconstruct the data back to the original data size. This allows autoencoders to be used for different processing applications such as data compression, image classification, image noise reduction, and image coloring. Hardware-wise, re-configurable architectures like Field Programmable Gate Arrays (FPGAs) have been used for accelerating computations from several domains because of their unique combination of flexibility, performance, and power efficiency. In this paper, we look at the different autoencoders available and use the convolutional autoencoder in both FPGA and GPU-based implementations to process noisy static MNIST images. We compare the different results achieved with the FPGA and GPU-based implementations and then discuss the pros and cons of each implementation. The evaluation of the proposed design achieved 80% accuracy and our experimental results show that the proposed accelerator achieves a throughput of 21.12 Giga-Operations Per Second (GOP/s) with a 5.93 W on-chip power consumption at 100 MHz. The comparison results with off-the-shelf devices and recent state-of-the-art implementations illustrate that the proposed accelerator has obvious advantages in terms of energy efficiency and design flexibility. We also discuss future work that can be done with the use of our proposed accelerator.

Submitted 17 January, 2023; originally announced January 2023.

Comments: Accepted at Intelligent Systems Conference (IntelliSys) 2023

arXiv:2206.13056 [pdf, other]

A Coupled Neural Circuit Design for Guillain-Barre Syndrome

Authors: Oguzhan Derebasi, Murat Isik, Oguzhan Demirag, Dilek Goksel Duru, Anup Das

Abstract: Guillain-Barre syndrome is a rare neurological condition in which the human immune system attacks the peripheral nervous system. A peripheral nervous system appears as a diffusively connected system of mathematical neuron models, and the system's period becomes shorter than the periods of each neural circuit. The stimuli in the conduction path that will address the myelin sheath that has lost its function are received by the axons and are conveyed externally to the target organ, aiming to solve the problem of decreased nerve conduction. In the NEURON simulation environment, one can create a neuron model and define biophysical events that take place within the system for study. In this environment, signal transmission between cells and dendrites is obtained graphically. The simulated potassium and sodium conductance are replicated adequately, and the electronic action potentials are quite comparable to those measured experimentally. In this work, we propose an analog and digital coupled neuron model comprising individual excitatory and inhibitory neural circuit blocks for a low-cost and energy-efficient system. Compared to digital design, our analog design performs in lower frequency but gives a 32.3% decreased energy efficiency. Thus, the resulting coupled analog hardware neuron model can be a proposed model for the simulation of reduced nerve conduction. As a result, the analog coupled neuron (even with its greater design complexity) is a serious contender for the future development of a wearable sensor device that could help with Guillain-Barre syndrome and other neurologic diseases.

Submitted 27 June, 2022; originally announced June 2022.

arXiv:2205.14177 [pdf, other]

Multiscale Voxel Based Decoding For Enhanced Natural Image Reconstruction From Brain Activity

Authors: Mali Halac, Murat Isik, Hasan Ayaz, Anup Das

Abstract: Reconstructing perceived images from human brain activity monitored by functional magnetic resonance imaging (fMRI) is hard, especially for natural images. Existing methods often result in blurry and unintelligible reconstructions with low fidelity. In this study, we present a novel approach for enhanced image reconstruction, in which existing methods for object decoding and image reconstruction are merged together. This is achieved by conditioning the reconstructed image to its decoded image category using a class-conditional generative adversarial network and neural style transfer. The results indicate that our approach improves the semantic similarity of the reconstructed images and can be used as a general framework for enhanced image reconstruction.

Submitted 27 May, 2022; originally announced May 2022.

Comments: Accepted at 2022 International Joint Conference on Neural Networks

arXiv:2204.02942 [pdf, other]

A Design Methodology for Fault-Tolerant Computing using Astrocyte Neural Networks

Authors: Murat Işık, Ankita Paul, M. Lakshmi Varshika, Anup Das

Abstract: We propose a design methodology to facilitate fault tolerance of deep learning models. First, we implement a many-core fault-tolerant neuromorphic hardware design, where neuron and synapse circuitries in each neuromorphic core are enclosed with astrocyte circuitries, the star-shaped glial cells of the brain that facilitate self-repair by restoring the spike firing frequency of a failed neuron using a closed-loop retrograde feedback signal. Next, we introduce astrocytes in a deep learning model to achieve the required degree of tolerance to hardware faults. Finally, we use a system software to partition the astrocyte-enabled model into clusters and implement them on the proposed fault-tolerant neuromorphic design. We evaluate this design methodology using seven deep learning inference models and show that it is both area and power efficient.

Submitted 6 April, 2022; originally announced April 2022.

Comments: Accepted at ACM Computing Frontiers, 2022

arXiv:2203.15092 [pdf, other]

Improved singing voice separation with chromagram-based pitch-aware remixing

Authors: Siyuan Yuan, Zhepei Wang, Umut Isik, Ritwik Giri, Jean-Marc Valin, Michael M. Goodwin, Arvindh Krishnaswamy

Abstract: Singing voice separation aims to separate music into vocals and accompaniment components. One of the major constraints for the task is the limited amount of training data with separated vocals. Data augmentation techniques such as random source mixing have been shown to make better use of existing data and mildly improve model performance. We propose a novel data augmentation technique, chromagram-based pitch-aware remixing, where music segments with high pitch alignment are mixed. By performing controlled experiments in both supervised and semi-supervised settings, we demonstrate that training models with pitch-aware remixing significantly improves the test signal-to-distortion ratio (SDR).

Submitted 28 March, 2022; originally announced March 2022.

Comments: To appear at ICASSP 2022, 5 pages, 1 figure

arXiv:2202.08897 [pdf, other]

Implementing Spiking Neural Networks on Neuromorphic Architectures: A Review

Authors: Phu Khanh Huynh, M. Lakshmi Varshika, Ankita Paul, Murat Isik, Adarsha Balaji, Anup Das

Abstract: Recently, both industry and academia have proposed several different neuromorphic systems to execute machine learning applications that are designed using Spiking Neural Networks (SNNs). With the growing complexity on design and technology fronts, programming such systems to admit and execute a machine learning application is becoming increasingly challenging. Additionally, neuromorphic systems are required to guarantee real-time performance, consume lower energy, and provide tolerance to logic and memory failures. Consequently, there is a clear need for system software frameworks that can implement machine learning applications on current and emerging neuromorphic systems, and simultaneously address performance, energy, and reliability. Here, we provide a comprehensive overview of such frameworks proposed for both platform-based design and hardware-software co-design. We highlight challenges and opportunities that the future holds in the area of system software technology for neuromorphic computing.

Submitted 17 February, 2022; originally announced February 2022.

arXiv:2102.06610 [pdf, other]

Enhancing into the codec: Noise Robust Speech Coding with Vector-Quantized Autoencoders

Authors: Jonah Casebeer, Vinjai Vale, Umut Isik, Jean-Marc Valin, Ritwik Giri, Arvindh Krishnaswamy

Abstract: Audio codecs based on discretized neural autoencoders have recently been developed and shown to provide significantly higher compression levels for comparable quality speech output. However, these models are tightly coupled with speech content, and produce unintended outputs in noisy conditions. Based on VQ-VAE autoencoders with WaveRNN decoders, we develop compressor-enhancer encoders and accompanying decoders, and show that they operate well in noisy conditions. We also observe that a compressor-enhancer model performs better on clean speech inputs than a compressor model trained only on clean speech.

Submitted 12 February, 2021; originally announced February 2021.

Comments: 5 pages, 2 figures, ICASSP 2021

arXiv:2008.08927 [pdf, other]

Generating Music with a Self-Correcting Non-Chronological Autoregressive Model

Authors: Wayne Chi, Prachi Kumar, Suri Yaddanapudi, Rahul Suresh, Umut Isik

Abstract: We describe a novel approach for generating music using a self-correcting, non-chronological, autoregressive model. We represent music as a sequence of edit events, each of which denotes either the addition or removal of a note, even a note previously generated by the model. During inference, we generate one edit event at a time using direct ancestral sampling. Our approach allows the model to fix previous mistakes such as incorrectly sampled notes and prevent accumulation of errors which autoregressive models are prone to have. Another benefit is a finer, note-by-note control during human and AI collaborative composition. We show through quantitative metrics and human survey evaluation that our approach generates better results than orderless NADE and Gibbs sampling approaches.

Submitted 18 August, 2020; originally announced August 2020.

Comments: 8 pages, 4 figures

arXiv:2008.04470 [pdf, other]

PoCoNet: Better Speech Enhancement with Frequency-Positional Embeddings, Semi-Supervised Conversational Data, and Biased Loss

Authors: Umut Isik, Ritwik Giri, Neerad Phansalkar, Jean-Marc Valin, Karim Helwani, Arvindh Krishnaswamy

Abstract: Neural network applications generally benefit from larger-sized models, but for current speech enhancement models, larger scale networks often suffer from decreased robustness to the variety of real-world use cases beyond what is encountered in training data. We introduce several innovations that lead to better large neural networks for speech enhancement. The novel PoCoNet architecture is a convolutional neural network that, with the use of frequency-positional embeddings, is able to more efficiently build frequency-dependent features in the early layers. A semi-supervised method helps increase the amount of conversational training data by pre-enhancing noisy datasets, improving performance on real recordings. A new loss function biased towards preserving speech quality helps the optimization better match human perceptual opinions on speech quality. Ablation experiments and objective and human opinion metrics show the benefits of the proposed improvements.

Submitted 10 August, 2020; originally announced August 2020.

Comments: 5 pages, 3 figures, INTERSPEECH 2020

arXiv:2008.04259 [pdf, other]

A Perceptually-Motivated Approach for Low-Complexity, Real-Time Enhancement of Fullband Speech

Authors: Jean-Marc Valin, Umut Isik, Neerad Phansalkar, Ritwik Giri, Karim Helwani, Arvindh Krishnaswamy

Abstract: Over the past few years, speech enhancement methods based on deep learning have greatly surpassed traditional methods based on spectral subtraction and spectral estimation. Many of these new techniques operate directly in the short-time Fourier transform (STFT) domain, resulting in a high computational complexity. In this work, we propose PercepNet, an efficient approach that relies on human perception of speech by focusing on the spectral envelope and on the periodicity of the speech. We demonstrate high-quality, real-time enhancement of fullband (48 kHz) speech with less than 5% of a CPU core.

Submitted 27 August, 2020; v1 submitted 10 August, 2020; originally announced August 2020.

Comments: Proc. INTERSPEECH 2020, 5 pages

arXiv:2007.10093 [pdf, other]

doi 10.1109/TVCG.2020.3039340

Learning Adaptive Sampling and Reconstruction for Volume Visualization

Authors: Sebastian Weiss, Mustafa Işık, Justus Thies, Rüdiger Westermann

Abstract: A central challenge in data visualization is to understand which data samples are required to generate an image of a data set in which the relevant information is encoded. In this work, we make a first step towards answering the question of whether an artificial neural network can predict where to sample the data with higher or lower density, by learning correspondences between the data, the sampling patterns and the generated images. We introduce a novel neural rendering pipeline, which is trained end-to-end to generate a sparse adaptive sampling structure from a given low-resolution input image, and reconstructs a high-resolution image from the sparse set of samples. For the first time, to the best of our knowledge, we demonstrate that the selection of structures that are relevant for the final visual representation can be jointly learned together with the reconstruction of this representation from these structures. Therefore, we introduce differentiable sampling and reconstruction stages, which can leverage back-propagation based on supervised losses solely on the final image. We shed light on the adaptive sampling patterns generated by the network pipeline and analyze its use for volume visualization including isosurface and direct volume rendering.

Submitted 20 July, 2020; originally announced July 2020.

ACM Class: I.3.7

arXiv:2002.09286 [pdf, other]

Efficient Trainable Front-Ends for Neural Speech Enhancement

Authors: Jonah Casebeer, Umut Isik, Shrikant Venkataramani, Arvindh Krishnaswamy

Abstract: Many neural speech enhancement and source separation systems operate in the time-frequency domain. Such models often benefit from making their Short-Time Fourier Transform (STFT) front-ends trainable. In current literature, these are implemented as large Discrete Fourier Transform matrices, which are prohibitively inefficient for low-compute systems. We present an efficient, trainable front-end based on the butterfly mechanism to compute the Fast Fourier Transform, and show its accuracy and efficiency benefits for low-compute neural speech enhancement models. We also explore the effects of making the STFT window trainable.

Submitted 19 February, 2020; originally announced February 2020.

Comments: 5 pages, 5 figures, ICASSP 2020

arXiv:1610.07737 [pdf, other]

doi 10.1017/fms.2020.26

Categorical Complexity

Authors: Saugata Basu, M. Umut Isik

Abstract: We introduce a notion of complexity of diagrams (and in particular of objects and morphisms) in an arbitrary category, as well as a notion of complexity of functors between categories equipped with complexity functions. We discuss several examples of this new definition in categories of wide common interest, such as finite sets, Boolean functions, topological spaces, vector spaces, semi-linear and semi-algebraic sets, graded algebras, affine and projective varieties and schemes, and modules over polynomial rings. We show that on one hand categorical complexity recovers in several settings classical notions of non-uniform computational complexity (such as circuit complexity), while on the other hand it has features which make it mathematically more natural. We also postulate that studying functor complexity is the categorical analog of classical questions in complexity theory about separating different complexity classes.

Submitted 10 December, 2019; v1 submitted 25 October, 2016; originally announced October 2016.

Comments: 47 pages

MSC Class: 18A15; 68Q15

Journal ref: Forum of Mathematics, Sigma 8 (2020) e34

arXiv:1609.02562 [pdf, ps, other]

Complexity Classes and Completeness in Algebraic Geometry

Authors: M. Umut Isik

Abstract: We study the computational complexity of sequences of projective varieties. We define analogues of the complexity classes P and NP for these and prove the NP-completeness of a sequence called the universal circuit resultant. This is the first family of compact spaces shown to be NP-complete in a geometric setting.

Submitted 8 September, 2016; originally announced September 2016.

arXiv:1409.5568 [pdf, other]

On the Derived Categories of Degree d Hypersurface Fibrations

Authors: Matthew Ballard, Dragos Deliu, David Favero, M. Umut Isik, Ludmil Katzarkov

Abstract: We provide descriptions of the derived categories of degree $d$ hypersurface fibrations which generalize a result of Kuznetsov for quadric fibrations and give a relative version of a well-known theorem of Orlov. Using a local generator and Morita theory, we re-interpret the resulting matrix factorization category as a derived-equivalent sheaf of dg-algebras on the base. Then, applying homological perturbation methods, we obtain a sheaf of $A_\infty$-algebras which gives a new description of homological projective duals for (relative) $d$-Veronese embeddings, recovering the sheaf of Clifford algebras obtained by Kuznetsov in the case when $d=2$.

Submitted 19 September, 2014; originally announced September 2014.

Comments: 30 pages, expanded from arXiv:1306.3957, submitted

arXiv:1306.3957 [pdf, ps, other]

Homological Projective Duality via Variation of Geometric Invariant Theory Quotients

Authors: Matthew Ballard, Dragos Deliu, David Favero, M. Umut Isik, Ludmil Katzarkov

Abstract: We provide a geometric approach to constructing Lefschetz collections and Landau-Ginzburg Homological Projective Duals from a variation of Geometric Invariant Theory quotients. This approach yields homological projective duals for Veronese embeddings in the setting of Landau-Ginzburg models. Our results also extend to a relative Homological Projective Duality framework.

Submitted 19 September, 2014; v1 submitted 17 June, 2013; originally announced June 2013.

Comments: 32 pages, expanded the original into two parts, accepted for publication in JEMS

arXiv:1212.3264 [pdf, ps, other]

Resolutions in factorization categories

Authors: Matthew Ballard, Dragos Deliu, David Favero, M. Umut Isik, Ludmil Katzarkov

Abstract: Generalizing Eisenbud's matrix factorizations, we define factorization categories. Following work of Positselski, we define their associated derived categories. We construct specific resolutions of factorizations built from a choice of resolutions of their components. We use these resolutions to lift fully-faithfulness statements from derived categories of Abelian categories to derived categories of factorizations and to construct a spectral sequence computing the morphism spaces in the derived categories of factorizations from Ext-groups of their components in the underlying Abelian category.

Submitted 12 May, 2014; v1 submitted 13 December, 2012; originally announced December 2012.

Comments: v2: Expanded further for enjoyment. 41 pages. v1: 24 pages, expanded from a section in arXiv:1203.6643, comments welcome

arXiv:1204.5001 [pdf, ps, other]

Improving the Entropy Estimate of Neuronal Firings of Modeled Cochlear Nucleus Neurons

Authors: Andrea Grigorescu, Marek Rudnicki, Michael Isik, Werner Hemmert, Stefano Rini

Abstract: In this correspondence, information-theoretic tools are used to investigate the statistical properties of modeled cochlear nucleus globular bushy cell spike trains. The firing patterns are obtained from simulation software that generates sample spike trains from any auditory input. Here we analyze for the first time the responses of globular bushy cells to voiced and unvoiced speech sounds. Classical entropy estimates, such as the direct method, are improved upon by considering a time-varying and time-dependent entropy estimate. With this method we investigated the relationship between the predictability of the neuronal response and the frequency content in the auditory signals. The analysis quantifies the temporal precision of the neuronal coding and the memory in the neuronal response.

Submitted 24 April, 2012; v1 submitted 23 April, 2012; originally announced April 2012.

arXiv:1011.1484 [pdf, ps, other]

Equivalence of the Derived Category of a Variety with a Singularity Category

Authors: M. Umut Isik

Abstract: We prove an equivalence between the derived category of a variety and the equivariant/graded singularity category of a corresponding singular variety. The equivalence also holds at the dg level.

Submitted 5 November, 2010; originally announced November 2010.

Comments: 16 pages

Showing 1–30 of 30 results for author: Isik, M