Search | arXiv e-print repository

Benchmarking Out-of-Distribution Generalization Capabilities of DNN-based Encoding Models for the Ventral Visual Cortex

Authors: Spandan Madan, Will Xiao, Mingran Cao, Hanspeter Pfister, Margaret Livingstone, Gabriel Kreiman

Abstract: We characterized the generalization capabilities of DNN-based encoding models when predicting neuronal responses from the visual cortex. We collected \textit{MacaqueITBench}, a large-scale dataset of neural population responses from the macaque inferior temporal (IT) cortex to over $300,000$ images, comprising $8,233$ unique natural images presented to seven monkeys over $109$ sessions. Using \tex… ▽ More We characterized the generalization capabilities of DNN-based encoding models when predicting neuronal responses from the visual cortex. We collected \textit{MacaqueITBench}, a large-scale dataset of neural population responses from the macaque inferior temporal (IT) cortex to over $300,000$ images, comprising $8,233$ unique natural images presented to seven monkeys over $109$ sessions. Using \textit{MacaqueITBench}, we investigated the impact of distribution shifts on models predicting neural activity by dividing the images into Out-Of-Distribution (OOD) train and test splits. The OOD splits included several different image-computable types including image contrast, hue, intensity, temperature, and saturation. Compared to the performance on in-distribution test images -- the conventional way these models have been evaluated -- models performed worse at predicting neuronal responses to out-of-distribution images, retaining as little as $20\%$ of the performance on in-distribution test images. The generalization performance under OOD shifts can be well accounted by a simple image similarity metric -- the cosine distance between image representations extracted from a pre-trained object recognition model is a strong predictor of neural predictivity under different distribution shifts. The dataset of images, neuronal firing rate recordings, and computational benchmarks are hosted publicly at: https://bit.ly/3zeutVd. △ Less

Submitted 16 June, 2024; originally announced June 2024.

arXiv:2404.14435 [pdf, other]

FreSeg: Frenet-Frame-based Part Segmentation for 3D Curvilinear Structures

Authors: Shixuan Gu, Jason Ken Adhinarta, Mikhail Bessmeltsev, Jiancheng Yang, Jessica Zhang, Daniel Berger, Jeff W. Lichtman, Hanspeter Pfister, Donglai Wei

Abstract: Part segmentation is a crucial task for 3D curvilinear structures like neuron dendrites and blood vessels, enabling the analysis of dendritic spines and aneurysms with scientific and clinical significance. However, their diversely winded morphology poses a generalization challenge to existing deep learning methods, which leads to labor-intensive manual correction. In this work, we propose FreSeg,… ▽ More Part segmentation is a crucial task for 3D curvilinear structures like neuron dendrites and blood vessels, enabling the analysis of dendritic spines and aneurysms with scientific and clinical significance. However, their diversely winded morphology poses a generalization challenge to existing deep learning methods, which leads to labor-intensive manual correction. In this work, we propose FreSeg, a framework of part segmentation tasks for 3D curvilinear structures. With Frenet-Frame-based point cloud transformation, it enables the models to learn more generalizable features and have significant performance improvements on tasks involving elongated and curvy geometries. We evaluate FreSeg on 2 datasets: 1) DenSpineEM, an in-house dataset for dendritic spine segmentation, and 2) IntrA, a public 3D dataset for intracranial aneurysm segmentation. Further, we will release the DenSpineEM dataset, which includes roughly 6,000 spines from 69 dendrites from 3 public electron microscopy (EM) datasets, to foster the development of effective dendritic spine instance extraction methods and, consequently, large-scale connectivity analysis to better understand mammalian brains. △ Less

Submitted 19 April, 2024; originally announced April 2024.

Comments: 10 pages, 4 figures

arXiv:2402.09372 [pdf, other]

Deep Rib Fracture Instance Segmentation and Classification from CT on the RibFrac Challenge

Authors: Jiancheng Yang, Rui Shi, Liang **, Xiaoyang Huang, Kaiming Kuang, Donglai Wei, Shixuan Gu, Jianying Liu, Pengfei Liu, Zhizhong Chai, Yongjie Xiao, Hao Chen, Liming Xu, Bang Du, Xiangyi Yan, Hao Tang, Adam Alessio, Gregory Holste, Jiapeng Zhang, Xiaoming Wang, Jianye He, Lixuan Che, Hanspeter Pfister, Ming Li, Bingbing Ni

Abstract: Rib fractures are a common and potentially severe injury that can be challenging and labor-intensive to detect in CT scans. While there have been efforts to address this field, the lack of large-scale annotated datasets and evaluation benchmarks has hindered the development and validation of deep learning algorithms. To address this issue, the RibFrac Challenge was introduced, providing a benchmar… ▽ More Rib fractures are a common and potentially severe injury that can be challenging and labor-intensive to detect in CT scans. While there have been efforts to address this field, the lack of large-scale annotated datasets and evaluation benchmarks has hindered the development and validation of deep learning algorithms. To address this issue, the RibFrac Challenge was introduced, providing a benchmark dataset of over 5,000 rib fractures from 660 CT scans, with voxel-level instance mask annotations and diagnosis labels for four clinical categories (buckle, nondisplaced, displaced, or segmental). The challenge includes two tracks: a detection (instance segmentation) track evaluated by an FROC-style metric and a classification track evaluated by an F1-style metric. During the MICCAI 2020 challenge period, 243 results were evaluated, and seven teams were invited to participate in the challenge summary. The analysis revealed that several top rib fracture detection solutions achieved performance comparable or even better than human experts. Nevertheless, the current rib fracture classification solutions are hardly clinically applicable, which can be an interesting area in the future. As an active benchmark and research resource, the data and online evaluation of the RibFrac Challenge are available at the challenge website. As an independent contribution, we have also extended our previous internal baseline by incorporating recent advancements in large-scale pretrained networks and point-based rib segmentation techniques. The resulting FracNet+ demonstrates competitive performance in rib fracture detection, which lays a foundation for further research and development in AI-assisted rib fracture detection and diagnosis. △ Less

Submitted 14 February, 2024; originally announced February 2024.

Comments: Challenge paper for MICCAI RibFrac Challenge (https://ribfrac.grand-challenge.org/)

arXiv:2309.10724 [pdf, other]

Sound Source Localization is All about Cross-Modal Alignment

Authors: Arda Senocak, Hyeonggon Ryu, Junsik Kim, Tae-Hyun Oh, Hanspeter Pfister, Joon Son Chung

Abstract: Humans can easily perceive the direction of sound sources in a visual scene, termed sound source localization. Recent studies on learning-based sound source localization have mainly explored the problem from a localization perspective. However, prior arts and existing benchmarks do not account for a more important aspect of the problem, cross-modal semantic understanding, which is essential for ge… ▽ More Humans can easily perceive the direction of sound sources in a visual scene, termed sound source localization. Recent studies on learning-based sound source localization have mainly explored the problem from a localization perspective. However, prior arts and existing benchmarks do not account for a more important aspect of the problem, cross-modal semantic understanding, which is essential for genuine sound source localization. Cross-modal semantic understanding is important in understanding semantically mismatched audio-visual events, e.g., silent objects, or off-screen sounds. To account for this, we propose a cross-modal alignment task as a joint task with sound source localization to better learn the interaction between audio and visual modalities. Thereby, we achieve high localization performance with strong cross-modal semantic understanding. Our method outperforms the state-of-the-art approaches in both sound source localization and cross-modal retrieval. Our work suggests that jointly tackling both tasks is necessary to conquer genuine sound source localization. △ Less

Submitted 19 September, 2023; originally announced September 2023.

Comments: ICCV 2023

arXiv:2212.10431 [pdf, other]

QuantArt: Quantizing Image Style Transfer Towards High Visual Fidelity

Authors: Siyu Huang, Jie An, Donglai Wei, Jiebo Luo, Hanspeter Pfister

Abstract: The mechanism of existing style transfer algorithms is by minimizing a hybrid loss function to push the generated image toward high similarities in both content and style. However, this type of approach cannot guarantee visual fidelity, i.e., the generated artworks should be indistinguishable from real ones. In this paper, we devise a new style transfer framework called QuantArt for high visual-fi… ▽ More The mechanism of existing style transfer algorithms is by minimizing a hybrid loss function to push the generated image toward high similarities in both content and style. However, this type of approach cannot guarantee visual fidelity, i.e., the generated artworks should be indistinguishable from real ones. In this paper, we devise a new style transfer framework called QuantArt for high visual-fidelity stylization. QuantArt pushes the latent representation of the generated artwork toward the centroids of the real artwork distribution with vector quantization. By fusing the quantized and continuous latent representations, QuantArt allows flexible control over the generated artworks in terms of content preservation, style similarity, and visual fidelity. Experiments on various style transfer settings show that our QuantArt framework achieves significantly higher visual fidelity compared with the existing style transfer methods. △ Less

Submitted 5 June, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

Comments: Accepted to CVPR 2023. Code is available at https://github.com/siyuhuang/QuantArt

arXiv:2210.09309 [pdf, other]

RibSeg v2: A Large-scale Benchmark for Rib Labeling and Anatomical Centerline Extraction

Authors: Liang **, Shixuan Gu, Donglai Wei, Jason Ken Adhinarta, Kaiming Kuang, Yongjie Jessica Zhang, Hanspeter Pfister, Bingbing Ni, Jiancheng Yang, Ming Li

Abstract: Automatic rib labeling and anatomical centerline extraction are common prerequisites for various clinical applications. Prior studies either use in-house datasets that are inaccessible to communities, or focus on rib segmentation that neglects the clinical significance of rib labeling. To address these issues, we extend our prior dataset (RibSeg) on the binary rib segmentation task to a comprehens… ▽ More Automatic rib labeling and anatomical centerline extraction are common prerequisites for various clinical applications. Prior studies either use in-house datasets that are inaccessible to communities, or focus on rib segmentation that neglects the clinical significance of rib labeling. To address these issues, we extend our prior dataset (RibSeg) on the binary rib segmentation task to a comprehensive benchmark, named RibSeg v2, with 660 CT scans (15,466 individual ribs in total) and annotations manually inspected by experts for rib labeling and anatomical centerline extraction. Based on the RibSeg v2, we develop a pipeline including deep learning-based methods for rib labeling, and a skeletonization-based method for centerline extraction. To improve computational efficiency, we propose a sparse point cloud representation of CT scans and compare it with standard dense voxel grids. Moreover, we design and analyze evaluation metrics to address the key challenges of each task. Our dataset, code, and model are available online to facilitate open research at https://github.com/M3DV/RibSeg △ Less

Submitted 1 August, 2023; v1 submitted 17 October, 2022; originally announced October 2022.

Comments: 10 pages, 6 figures, journal

arXiv:2206.07150 [pdf, other]

Attacks on Perception-Based Control Systems: Modeling and Fundamental Limits

Authors: Amir Khazraei, Henry Pfister, Miroslav Pajic

Abstract: We study the performance of perception-based control systems in the presence of attacks, and provide methods for modeling and analysis of their resiliency to stealthy attacks on both physical and perception-based sensing. Specifically, we consider a general setup with a nonlinear affine physical plant controlled with a perception-based controller that maps both the physical (e.g., IMUs) and percep… ▽ More We study the performance of perception-based control systems in the presence of attacks, and provide methods for modeling and analysis of their resiliency to stealthy attacks on both physical and perception-based sensing. Specifically, we consider a general setup with a nonlinear affine physical plant controlled with a perception-based controller that maps both the physical (e.g., IMUs) and perceptual (e.g., camera) sensing to the control input; the system is also equipped with a statistical or learning-based anomaly detector (AD). We model the attacks in the most general form, and introduce the notions of attack effectiveness and stealthiness independent of the used AD. In such setting, we consider attacks with different levels of runtime knowledge about the plant. We find sufficient conditions for existence of stealthy effective attacks that force the plant into an unsafe region without being detected by any AD. We show that as the open-loop unstable plant dynamics diverges faster and the closed-loop system converges faster to an equilibrium point, the system is more vulnerable to effective stealthy attacks. Also, depending on runtime information available to the attacker, the probability of attack remaining stealthy can be arbitrarily close to one, if the attacker's estimate of the plant's state is arbitrarily close to the true state; when an accurate estimate of the plant state is not available, the stealthiness level depends on the control performance in attack-free operation. △ Less

Submitted 27 August, 2023; v1 submitted 14 June, 2022; originally announced June 2022.

arXiv:2204.02844 [pdf, other]

Learning to Generate Realistic Noisy Images via Pixel-level Noise-aware Adversarial Training

Authors: Yuanhao Cai, Xiaowan Hu, Haoqian Wang, Yulun Zhang, Hanspeter Pfister, Donglai Wei

Abstract: Existing deep learning real denoising methods require a large amount of noisy-clean image pairs for supervision. Nonetheless, capturing a real noisy-clean dataset is an unacceptable expensive and cumbersome procedure. To alleviate this problem, this work investigates how to generate realistic noisy images. Firstly, we formulate a simple yet reasonable noise model that treats each real noisy pixel… ▽ More Existing deep learning real denoising methods require a large amount of noisy-clean image pairs for supervision. Nonetheless, capturing a real noisy-clean dataset is an unacceptable expensive and cumbersome procedure. To alleviate this problem, this work investigates how to generate realistic noisy images. Firstly, we formulate a simple yet reasonable noise model that treats each real noisy pixel as a random variable. This model splits the noisy image generation problem into two sub-problems: image domain alignment and noise domain alignment. Subsequently, we propose a novel framework, namely Pixel-level Noise-aware Generative Adversarial Network (PNGAN). PNGAN employs a pre-trained real denoiser to map the fake and real noisy images into a nearly noise-free solution space to perform image domain alignment. Simultaneously, PNGAN establishes a pixel-level adversarial training to conduct noise domain alignment. Additionally, for better noise fitting, we present an efficient architecture Simple Multi-scale Network (SMNet) as the generator. Qualitative validation shows that noise generated by PNGAN is highly similar to real noise in terms of intensity and distribution. Quantitative experiments demonstrate that a series of denoisers trained with the generated noisy images achieve state-of-the-art (SOTA) results on four real denoising benchmarks. Part of codes, pre-trained models, and results are available at https://github.com/caiyuanhao1998/PNGAN for comparisons. △ Less

Submitted 14 September, 2022; v1 submitted 6 April, 2022; originally announced April 2022.

Comments: NeurIPS 2021

arXiv:2112.05754 [pdf, other]

PyTorch Connectomics: A Scalable and Flexible Segmentation Framework for EM Connectomics

Authors: Zudi Lin, Donglai Wei, Jeff Lichtman, Hanspeter Pfister

Abstract: We present PyTorch Connectomics (PyTC), an open-source deep-learning framework for the semantic and instance segmentation of volumetric microscopy images, built upon PyTorch. We demonstrate the effectiveness of PyTC in the field of connectomics, which aims to segment and reconstruct neurons, synapses, and other organelles like mitochondria at nanometer resolution for understanding neuronal communi… ▽ More We present PyTorch Connectomics (PyTC), an open-source deep-learning framework for the semantic and instance segmentation of volumetric microscopy images, built upon PyTorch. We demonstrate the effectiveness of PyTC in the field of connectomics, which aims to segment and reconstruct neurons, synapses, and other organelles like mitochondria at nanometer resolution for understanding neuronal communication, metabolism, and development in animal brains. PyTC is a scalable and flexible toolbox that tackles datasets at different scales and supports multi-task and semi-supervised learning to better exploit expensive expert annotations and the vast amount of unlabeled data during training. Those functionalities can be easily realized in PyTC by changing the configuration options without coding and adapted to other 2D and 3D segmentation tasks for different tissues and imaging modalities. Quantitatively, our framework achieves the best performance in the CREMI challenge for synaptic cleft segmentation (outperforms existing best result by relatively 6.1$\%$) and competitive performance on mitochondria and neuronal nuclei segmentation. Code and tutorials are publicly available at https://connectomics.readthedocs.io. △ Less

Submitted 9 December, 2021; originally announced December 2021.

Comments: Technical report

arXiv:2110.14795 [pdf, other]

doi 10.1038/s41597-022-01721-8

MedMNIST v2 -- A large-scale lightweight benchmark for 2D and 3D biomedical image classification

Authors: Jiancheng Yang, Rui Shi, Donglai Wei, Zequan Liu, Lin Zhao, Bilian Ke, Hanspeter Pfister, Bingbing Ni

Abstract: We introduce MedMNIST v2, a large-scale MNIST-like dataset collection of standardized biomedical images, including 12 datasets for 2D and 6 datasets for 3D. All images are pre-processed into a small size of 28x28 (2D) or 28x28x28 (3D) with the corresponding classification labels so that no background knowledge is required for users. Covering primary data modalities in biomedical images, MedMNIST v… ▽ More We introduce MedMNIST v2, a large-scale MNIST-like dataset collection of standardized biomedical images, including 12 datasets for 2D and 6 datasets for 3D. All images are pre-processed into a small size of 28x28 (2D) or 28x28x28 (3D) with the corresponding classification labels so that no background knowledge is required for users. Covering primary data modalities in biomedical images, MedMNIST v2 is designed to perform classification on lightweight 2D and 3D images with various dataset scales (from 100 to 100,000) and diverse tasks (binary/multi-class, ordinal regression, and multi-label). The resulting dataset, consisting of 708,069 2D images and 10,214 3D images in total, could support numerous research / educational purposes in biomedical image analysis, computer vision, and machine learning. We benchmark several baseline methods on MedMNIST v2, including 2D / 3D neural networks and open-source / commercial AutoML tools. The data and code are publicly available at https://medmnist.com/. △ Less

Submitted 25 September, 2022; v1 submitted 27 October, 2021; originally announced October 2021.

Comments: The data and code are publicly available at https://medmnist.com/. arXiv admin note: text overlap with arXiv:2010.14925

Journal ref: Scientific Data 2023

arXiv:2109.09521 [pdf, other]

doi 10.1007/978-3-030-87193-2_58

RibSeg Dataset and Strong Point Cloud Baselines for Rib Segmentation from CT Scans

Authors: Jiancheng Yang, Shixuan Gu, Donglai Wei, Hanspeter Pfister, Bingbing Ni

Abstract: Manual rib inspections in computed tomography (CT) scans are clinically critical but labor-intensive, as 24 ribs are typically elongated and oblique in 3D volumes. Automatic rib segmentation methods can speed up the process through rib measurement and visualization. However, prior arts mostly use in-house labeled datasets that are publicly unavailable and work on dense 3D volumes that are computat… ▽ More Manual rib inspections in computed tomography (CT) scans are clinically critical but labor-intensive, as 24 ribs are typically elongated and oblique in 3D volumes. Automatic rib segmentation methods can speed up the process through rib measurement and visualization. However, prior arts mostly use in-house labeled datasets that are publicly unavailable and work on dense 3D volumes that are computationally inefficient. To address these issues, we develop a labeled rib segmentation benchmark, named \emph{RibSeg}, including 490 CT scans (11,719 individual ribs) from a public dataset. For ground truth generation, we used existing morphology-based algorithms and manually refined its results. Then, considering the sparsity of ribs in 3D volumes, we thresholded and sampled sparse voxels from the input and designed a point cloud-based baseline method for rib segmentation. The proposed method achieves state-of-the-art segmentation performance (Dice~$\approx95\%$) with significant efficiency ($10\sim40\times$ faster than prior arts). The RibSeg dataset, code, and model in PyTorch are available at https://github.com/M3DV/RibSeg. △ Less

Submitted 17 September, 2021; originally announced September 2021.

Comments: MICCAI 2021. The dataset, code, and model are available at https://github.com/M3DV/RibSeg

arXiv:2109.08684 [pdf, other]

doi 10.1007/978-3-030-87240-3_55

Asymmetric 3D Context Fusion for Universal Lesion Detection

Authors: Jiancheng Yang, Yi He, Kaiming Kuang, Zudi Lin, Hanspeter Pfister, Bingbing Ni

Abstract: Modeling 3D context is essential for high-performance 3D medical image analysis. Although 2D networks benefit from large-scale 2D supervised pretraining, it is weak in capturing 3D context. 3D networks are strong in 3D context yet lack supervised pretraining. As an emerging technique, \emph{3D context fusion operator}, which enables conversion from 2D pretrained networks, leverages the advantages… ▽ More Modeling 3D context is essential for high-performance 3D medical image analysis. Although 2D networks benefit from large-scale 2D supervised pretraining, it is weak in capturing 3D context. 3D networks are strong in 3D context yet lack supervised pretraining. As an emerging technique, \emph{3D context fusion operator}, which enables conversion from 2D pretrained networks, leverages the advantages of both and has achieved great success. Existing 3D context fusion operators are designed to be spatially symmetric, i.e., performing identical operations on each 2D slice like convolutions. However, these operators are not truly equivariant to translation, especially when only a few 3D slices are used as inputs. In this paper, we propose a novel asymmetric 3D context fusion operator (A3D), which uses different weights to fuse 3D context from different 2D slices. Notably, A3D is NOT translation-equivariant while it significantly outperforms existing symmetric context fusion operators without introducing large computational overhead. We validate the effectiveness of the proposed method by extensive experiments on DeepLesion benchmark, a large-scale public dataset for universal lesion detection from computed tomography (CT). The proposed A3D consistently outperforms symmetric context fusion operators by considerable margins, and establishes a new \emph{state of the art} on DeepLesion. To facilitate open research, our code and model in PyTorch are available at https://github.com/M3DV/AlignShift. △ Less

Submitted 17 September, 2021; originally announced September 2021.

Comments: MICCAI 2021. The code and model are available at https://github.com/M3DV/AlignShift

arXiv:2010.14258 [pdf, other]

Physics-Based Deep Learning for Fiber-Optic Communication Systems

Authors: Christian Häger, Henry D. Pfister

Abstract: We propose a new machine-learning approach for fiber-optic communication systems whose signal propagation is governed by the nonlinear Schrödinger equation (NLSE). Our main observation is that the popular split-step method (SSM) for numerically solving the NLSE has essentially the same functional form as a deep multi-layer neural network; in both cases, one alternates linear steps and pointwise no… ▽ More We propose a new machine-learning approach for fiber-optic communication systems whose signal propagation is governed by the nonlinear Schrödinger equation (NLSE). Our main observation is that the popular split-step method (SSM) for numerically solving the NLSE has essentially the same functional form as a deep multi-layer neural network; in both cases, one alternates linear steps and pointwise nonlinearities. We exploit this connection by parameterizing the SSM and viewing the linear steps as general linear functions, similar to the weight matrices in a neural network. The resulting physics-based machine-learning model has several advantages over "black-box" function approximators. For example, it allows us to examine and interpret the learned solutions in order to understand why they perform well. As an application, low-complexity nonlinear equalization is considered, where the task is to efficiently invert the NLSE. This is commonly referred to as digital backpropagation (DBP). Rather than employing neural networks, the proposed algorithm, dubbed learned DBP (LDBP), uses the physics-based model with trainable filters in each step and its complexity is reduced by progressively pruning filter taps during gradient descent. Our main finding is that the filters can be pruned to remarkably short lengths-as few as 3 taps/step-without sacrificing performance. As a result, the complexity can be reduced by orders of magnitude in comparison to prior work. By inspecting the filter responses, an additional theoretical justification for the learned parameter configurations is provided. Our work illustrates that combining data-driven optimization with existing domain knowledge can generate new insights into old communications problems. △ Less

Submitted 27 October, 2020; originally announced October 2020.

Comments: 15 pages, 11 figures, submitted to IEEE J. Sel. Areas Commun., code available at https://github.com/chaeger/LDBP, extension of arXiv:1710.06234(1), arXiv:1804.02799(1), arXiv:1901.07592(2)

arXiv:2010.12313 [pdf, other]

doi 10.1109/JLT.2020.3034047

Model-Based Machine Learning for Joint Digital Backpropagation and PMD Compensation

Authors: Rick M. Bütler, Christian Häger, Henry D. Pfister, Gabriele Liga, Alex Alvarado

Abstract: In this paper, we propose a model-based machine-learning approach for dual-polarization systems by parameterizing the split-step Fourier method for the Manakov-PMD equation. The resulting method combines hardware-friendly time-domain nonlinearity mitigation via the recently proposed learned digital backpropagation (LDBP) with distributed compensation of polarization-mode dispersion (PMD). We refer… ▽ More In this paper, we propose a model-based machine-learning approach for dual-polarization systems by parameterizing the split-step Fourier method for the Manakov-PMD equation. The resulting method combines hardware-friendly time-domain nonlinearity mitigation via the recently proposed learned digital backpropagation (LDBP) with distributed compensation of polarization-mode dispersion (PMD). We refer to the resulting approach as LDBP-PMD. We train LDBP-PMD on multiple PMD realizations and show that it converges within 1% of its peak dB performance after 428 training iterations on average, yielding a peak effective signal-to-noise ratio of only 0.30 dB below the PMD-free case. Similar to state-of-the-art lumped PMD compensation algorithms in practical systems, our approach does not assume any knowledge about the particular PMD realization along the link, nor any knowledge about the total accumulated PMD. This is a significant improvement compared to prior work on distributed PMD compensation, where knowledge about the accumulated PMD is typically assumed. We also compare different parameterization choices in terms of performance, complexity, and convergence behavior. Lastly, we demonstrate that the learned models can be successfully retrained after an abrupt change of the PMD realization along the fiber. △ Less

Submitted 23 October, 2020; originally announced October 2020.

Comments: 10 pages, 11 figures, to appear in the IEEE/OSA Journal of Lightwave Technology

arXiv:2007.10025 [pdf, other]

doi 10.1109/JLT.2020.2994220

Revisiting Efficient Multi-Step Nonlinearity Compensation with Machine Learning: An Experimental Demonstration

Authors: Vinícius Oliari, Sebastiaan Goossens, Christian Häger, Gabriele Liga, Rick M. Bütler, Menno van den Hout, Sjoerd van der Heide, Henry D. Pfister, Chigo Okonkwo, Alex Alvarado

Abstract: Efficient nonlinearity compensation in fiber-optic communication systems is considered a key element to go beyond the "capacity crunch''. One guiding principle for previous work on the design of practical nonlinearity compensation schemes is that fewer steps lead to better systems. In this paper, we challenge this assumption and show how to carefully design multi-step approaches that provide bette… ▽ More Efficient nonlinearity compensation in fiber-optic communication systems is considered a key element to go beyond the "capacity crunch''. One guiding principle for previous work on the design of practical nonlinearity compensation schemes is that fewer steps lead to better systems. In this paper, we challenge this assumption and show how to carefully design multi-step approaches that provide better performance--complexity trade-offs than their few-step counterparts. We consider the recently proposed learned digital backpropagation (LDBP) approach, where the linear steps in the split-step method are re-interpreted as general linear functions, similar to the weight matrices in a deep neural network. Our main contribution lies in an experimental demonstration of this approach for a 25 Gbaud single-channel optical transmission system. It is shown how LDBP can be integrated into a coherent receiver DSP chain and successfully trained in the presence of various hardware impairments. Our results show that LDBP with limited complexity can achieve better performance than standard DBP by using very short, but jointly optimized, finite-impulse response filters in each step. This paper also provides an overview of recently proposed extensions of LDBP and we comment on potentially interesting avenues for future work. △ Less

Submitted 20 July, 2020; originally announced July 2020.

Comments: 10 pages, 5 figures. Author version of a paper published in the Journal of Lightwave Technology. OSA/IEEE copyright may apply

Journal ref: Journal of Lightwave Technology, vol. 38, no. 12, pp. 3114-3124, 15 June, 2020

arXiv:2001.09277 [pdf, other]

Model-Based Machine Learning for Joint Digital Backpropagation and PMD Compensation

Authors: Christian Häger, Henry D. Pfister, Rick M. Bütler, Gabriele Liga, Alex Alvarado

Abstract: We propose a model-based machine-learning approach for polarization-multiplexed systems by parameterizing the split-step method for the Manakov-PMD equation. This approach performs hardware-friendly DBP and distributed PMD compensation with performance close to the PMD-free case. We propose a model-based machine-learning approach for polarization-multiplexed systems by parameterizing the split-step method for the Manakov-PMD equation. This approach performs hardware-friendly DBP and distributed PMD compensation with performance close to the PMD-free case. △ Less

Submitted 25 January, 2020; originally announced January 2020.

Comments: 3 pages, 2 figures

arXiv:1904.09807 [pdf, other]

Revisiting Multi-Step Nonlinearity Compensation with Machine Learning

Authors: Christian Häger, Henry D. Pfister, Rick M. Bütler, Gabriele Liga, Alex Alvarado

Abstract: For the efficient compensation of fiber nonlinearity, one of the guiding principles appears to be: fewer steps are better and more efficient. We challenge this assumption and show that carefully designed multi-step approaches can lead to better performance-complexity trade-offs than their few-step counterparts. For the efficient compensation of fiber nonlinearity, one of the guiding principles appears to be: fewer steps are better and more efficient. We challenge this assumption and show that carefully designed multi-step approaches can lead to better performance-complexity trade-offs than their few-step counterparts. △ Less

Submitted 22 April, 2019; originally announced April 2019.

Comments: 4 pages, 3 figures, This is a preprint of a paper submitted to the 2019 European Conference on Optical Communication

Showing 1–17 of 17 results for author: Pfister, H