Search | arXiv e-print repository

Relaxed Multi-Tx DDM Online Calibration

Authors: Mayeul Jeannin, Oliver Lang, Farhan Bin Khalid, Dian Tresna Nugraha, Mario Huemer

Abstract: In multiple-input and multiple-output (MIMO) radar systems based on Doppler-division multiplexing (DDM), phase shifters are employed in the transmit paths and require calibration strategies to maintain optimal performance throughout the radar system's life cycle. In this paper, we propose a novel family of DDM codes that enables an online calibration of the phase shifters that scales realistically to any number of simultaneously activated transmit (Tx) channels during the calibration frames. To achieve this goal, we employ the previously developed odd-DDM (ODDM) sequences to design calibration DDM codes with reduced inter-Tx leakage. The proposed calibration sequence is applied to an automotive radar data set modulated with erroneous phase shifters.

Submitted 25 June, 2024; originally announced June 2024.

Comments: 5 pages, 6 figures, 2 tables, conference

arXiv:2311.09746 [pdf, ps, other]

OFDM-based Waveforms for Joint Sensing and Communications Robust to Frequency Selective IQ Imbalance

Authors: Oliver Lang, Christian Hofbauer, Moritz Tockner, Reinhard Feger, Thomas Wagner, Mario Huemer

Abstract: Orthogonal frequency-division multiplexing (OFDM) is a promising waveform candidate for future joint sensing and communication systems. It is well known that the OFDM waveform is vulnerable to in-phase and quadrature-phase (IQ) imbalance, which increases the noise floor in a range-Doppler map (RDM). A state-of-the-art method for robustifying the OFDM waveform against IQ imbalance avoids an increased noise floor, but it generates additional ghost objects in the RDM [1]. A consequence of these additional ghost objects is a reduction of the maximum unambiguous range. In this work, a novel OFDM-based waveform robust to IQ imbalance is proposed, which neither increases the noise floor nor reduces the maximum unambiguous range. The latter is achieved by shifting the ghost objects in the RDM to different velocities such that their range variations observed over several consecutive RDMs do not correspond to the observed velocity. This allows tracking algorithms to identify them as ghost objects and eliminate them for the follow-up processing steps. Moreover, we propose complete communication systems for both the proposed waveform as well as for the state-of-the-art waveform, including methods for channel estimation, synchronization, and data estimation that are specifically designed to deal with frequency selective IQ imbalance which occurs in wideband systems. The effectiveness of these communication systems is demonstrated by means of bit error ratio (BER) simulations.

Submitted 16 November, 2023; originally announced November 2023.

arXiv:2309.02901 [pdf, other]

Bi-Linear Homogeneity Enforced Calibration for Pipelined ADCs

Authors: Matthias Wagner, Oliver Lang, Esmaeil Kavousi Ghafi, Andreas Schwarz, Mario Huemer

Abstract: Pipelined analog-to-digital converters (ADCs) are key enablers in many state-of-the-art signal processing systems with high sampling rates. In addition to high sampling rates, such systems often demand a high linearity. To meet these challenging linearity requirements, ADC calibration techniques were heavily investigated throughout the past decades. One limitation in ADC calibration is the need for a precisely known test signal. In our previous work, we proposed the homogeneity enforced calibration (HEC) approach, which circumvents this need by consecutively feeding a test signal and a scaled version of it into the ADC. The calibration itself is performed using only the corresponding output samples, such that the test signal can remain unknown. On the downside, the HEC approach requires the option to accurately scale the test signal, impeding an on-chip implementation. In this work, we provide a thorough analysis of the HEC approach, including the effects of an inaccurately scaled test signal. Furthermore, the bi-linear homogeneity enforced calibration (BL-HEC) approach is introduced and suggested to account for an inaccurate scaling and, therefore, to facilitate an on-chip implementation. In addition, a comprehensive stability and convergence analysis of the BL-HEC approach is carried out. Finally, we verify our concept with simulations.

Submitted 6 September, 2023; originally announced September 2023.

Comments: 12 pages, 5 figures

arXiv:2308.12591 [pdf, other]

SICNN: Soft Interference Cancellation Inspired Neural Network Equalizers

Authors: Stefan Baumgartner, Oliver Lang, Mario Huemer

Abstract: In recent years data-driven machine learning approaches have been extensively studied to replace or enhance traditionally model-based processing in digital communication systems. In this work, we focus on equalization and propose a novel neural network (NN-)based approach, referred to as SICNN. SICNN is designed by deep unfolding a model-based iterative soft interference cancellation (SIC) method. It eliminates the main disadvantages of its model-based counterpart, which suffers from high computational complexity and performance degradation due to required approximations. We present different variants of SICNN. SICNNv1 is specifically tailored to single carrier frequency domain equalization (SC-FDE) systems, the communication system mainly regarded in this work. SICNNv2 is more universal and is applicable as an equalizer in any communication system with a block-based data transmission scheme. Moreover, for both SICNNv1 and SICNNv2, we present versions with highly reduced numbers of learnable parameters. Another contribution of this work is a novel approach for generating training datasets for NN-based equalizers, which significantly improves their performance at high signal-to-noise ratios. We compare the bit error ratio performance of the proposed NN-based equalizers with state-of-the-art model-based and NN-based approaches, highlighting the superiority of SICNNv1 over all other methods for SC-FDE. Exemplarily, to emphasize its universality, SICNNv2 is additionally applied to a unique word orthogonal frequency division multiplexing (UW-OFDM) system, where it achieves state-of-the-art performance. Furthermore, we present a thorough complexity analysis of the proposed NN-based equalization approaches, and we investigate the influence of the training set size on the performance of NN-based equalizers.

Submitted 11 March, 2024; v1 submitted 24 August, 2023; originally announced August 2023.

arXiv:2306.00985 [pdf]

doi 10.1016/j.ebiom.2024.105075

Using generative AI to investigate medical imagery models and datasets

Authors: Oran Lang, Doron Yaya-Stupp, Ilana Traynis, Heather Cole-Lewis, Chloe R. Bennett, Courtney Lyles, Charles Lau, Christopher Semturs, Dale R. Webster, Greg S. Corrado, Avinatan Hassidim, Yossi Matias, Yun Liu, Naama Hammel, Boris Babenko

Abstract: AI models have shown promise in many medical imaging tasks. However, our ability to explain what signals these models have learned is severely lacking. Explanations are needed in order to increase the trust in AI-based models, and could enable novel scientific discovery by uncovering signals in the data that are not yet known to experts. In this paper, we present a method for automatic visual explanations leveraging team-based expertise by generating hypotheses of what visual signals in the images are correlated with the task. We propose the following four steps: (i) train a classifier to perform a given task; (ii) train a classifier-guided StyleGAN-based image generator (StylEx); (iii) automatically detect and visualize the top visual attributes that the classifier is sensitive towards; (iv) formulate hypotheses for the underlying mechanisms, to stimulate future research. Specifically, we present the discovered attributes to an interdisciplinary panel of experts so that hypotheses can account for social and structural determinants of health. We demonstrate results on eight prediction tasks across three medical imaging modalities: retinal fundus photographs, external eye photographs, and chest radiographs. We showcase examples of attributes that capture clinically known features, confounders that arise from factors beyond physiological mechanisms, and reveal a number of physiologically plausible novel attributes. Our approach has the potential to enable researchers to better understand, improve their assessment, and extract new knowledge from AI-based models. Importantly, we highlight that attributes generated by our framework can capture phenomena beyond physiology or pathophysiology, reflecting the real world nature of healthcare delivery and socio-cultural factors. Finally, we intend to release code to enable researchers to train their own StylEx models and analyze their predictive tasks.

Submitted 1 June, 2023; originally announced June 2023.

Comments: 34 pages, 1 figure

arXiv:2303.12438 [pdf, ps, other]

Doppler-Division Multiplexing for MIMO OFDM Joint Sensing and Communications

Authors: Oliver Lang, Christian Hofbauer, Reinhard Feger, Mario Huemer

Abstract: A promising waveform candidate for future joint sensing and communication systems is orthogonal frequency-division multiplexing (OFDM). For such systems, supporting multiple transmit antennas requires multiplexing methods for the generation of orthogonal transmit signals, where equidistant subcarrier interleaving (ESI) is the most popular multiplexing method. In this work, we analyze a multiplexing method called Doppler-division multiplexing (DDM). This method applies a phase shift from OFDM symbol to OFDM symbol to separate signals transmitted by different Tx antennas along the velocity axis of the range-Doppler map. While general properties of DDM for the task of radar sensing are analyzed in this work, the main focus lies on the implications of DDM on the communication task. It will be shown that for DDM, the channels observed in the communication receiver are heavily time-varying, preventing any meaningful transmission of data when not taken into account. In this work, a communication system designed to combat these time-varying channels is proposed, which includes methods for data estimation, synchronization, and channel estimation. Bit error ratio (BER) simulations demonstrate the superiority of this communications system compared to a system utilizing ESI.

Submitted 22 March, 2023; originally announced March 2023.

Comments: 13 pages, 11 figures
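
The symbol-to-symbol phase shift described in the abstract above can be illustrated with a small sketch. It is not taken from the paper: it assumes a uniform linear phase ramp per Tx antenna, a single static target, and hypothetical names (`ddm_phase_code`, `doppler_peaks`). Under those assumptions, each Tx antenna should show up at its own bin along the Doppler (velocity) axis.

```python
import cmath


def ddm_phase_code(num_tx, num_symbols):
    """Assumed DDM code: Tx k multiplies symbol n by exp(j*2*pi*k*n/num_tx)."""
    return [
        [cmath.exp(2j * cmath.pi * k * n / num_tx) for n in range(num_symbols)]
        for k in range(num_tx)
    ]


def dft(x):
    """Plain O(N^2) discrete Fourier transform over the slow-time axis."""
    n_len = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * m * n / n_len) for n in range(n_len))
        for m in range(n_len)
    ]


def doppler_peaks(num_tx, num_symbols, gains):
    """Return the Doppler bins of the num_tx strongest peaks for a static target."""
    codes = ddm_phase_code(num_tx, num_symbols)
    # The receiver observes the superposition of all phase-coded Tx signals.
    rx = [
        sum(gains[k] * codes[k][n] for k in range(num_tx))
        for n in range(num_symbols)
    ]
    spectrum = [abs(v) for v in dft(rx)]
    strongest = sorted(range(num_symbols), key=lambda m: spectrum[m], reverse=True)
    return sorted(strongest[:num_tx])


if __name__ == "__main__":
    # Four Tx antennas, 16 OFDM symbols, arbitrary per-antenna path gains.
    print(doppler_peaks(4, 16, [1.0, 0.8, 0.6, 0.9]))
```

With four antennas and 16 symbols, the ramp of Tx k completes k full cycles every four symbols, so its energy lands at bin 4k. A moving target would shift all four peaks by the same Doppler offset, which is why the separation only stays unambiguous over a reduced velocity range.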

arXiv:2211.06054 [pdf, other]

Neural Network Approaches for Data Estimation in Unique Word OFDM Systems

Authors: Stefan Baumgartner, Gergő Bognár, Oliver Lang, Mario Huemer

Abstract: Data estimation has been conducted with model-based estimation methods since the beginning of digital communications. However, motivated by the growing success of machine learning, current research focuses on replacing model-based data estimation methods by data-driven approaches, mainly neural networks (NNs). In this work, we particularly investigate the incorporation of existing model knowledge into data-driven approaches, which is expected to lead to complexity reduction and/or performance enhancement. We describe three different options, namely "model-inspired" pre-processing, choosing an NN architecture motivated by the properties of the underlying communication system, and inferring the layer structure of an NN with the help of model knowledge. Most of the current publications on NN-based data estimation deal with general multiple-input multiple-output (MIMO) communication systems. In this work, we investigate NN-based data estimation for so-called unique word orthogonal frequency division multiplexing (UW-OFDM) systems. We highlight differences between UW-OFDM systems and general MIMO systems one has to be aware of when using NNs for data estimation, and we introduce measures for successful utilization of NN-based data estimators in UW-OFDM systems. Further, we investigate the use of NNs for data estimation when channel coded data transmission is conducted, and we present adaptations to be made, such that NN-based data estimators provide satisfying performance for this case. We compare the presented NNs concerning achieved bit error ratio performance and computational complexity, we show the peculiar distributions of their data estimates, and we also point out their downsides compared to model-based equalizers.

Submitted 11 November, 2022; originally announced November 2022.

arXiv:2207.04835 [pdf, other]

Homogeneity Enforced Calibration of Stage Nonidealities for Pipelined ADCs

Authors: Matthias Wagner, Oliver Lang, Thomas Bauernfeind, Mario Huemer

Abstract: Pipelined analog-to-digital converters (ADCs) are fundamental components of various signal processing systems requiring high sampling rates and a high linearity. Over the past years, calibration techniques have been intensively investigated to increase the linearity. In this work, we propose an equalization-based calibration technique which does not require knowledge of the ADC input signal for calibration. For that, a test signal and a scaled version of it are fed into the ADC sequentially, while only the corresponding output samples are used for calibration. Several test signal sources are possible, such as a signal generator (SG) or the system application (SA) itself. For the latter case, the presented method corresponds to a background calibration technique. Thus, slowly changing errors are tracked and calibrated continuously. Because of the low computational complexity of the calibration technique, it is suitable for an on-chip implementation. Ultimately, this work contains an analysis of the stability and convergence behavior as well as simulation results.

Submitted 11 July, 2022; originally announced July 2022.

arXiv:2104.13369 [pdf, other]

Explaining in Style: Training a GAN to explain a classifier in StyleSpace

Authors: Oran Lang, Yossi Gandelsman, Michal Yarom, Yoav Wald, Gal Elidan, Avinatan Hassidim, William T. Freeman, Phillip Isola, Amir Globerson, Michal Irani, Inbar Mosseri

Abstract: Image classification models can depend on multiple different semantic attributes of the image. An explanation of the decision of the classifier needs to both discover and visualize these properties. Here we present StylEx, a method for doing this, by training a generative model to specifically explain multiple attributes that underlie classifier decisions. A natural source for such attributes is the StyleSpace of StyleGAN, which is known to generate semantically meaningful dimensions in the image. However, because standard GAN training is not dependent on the classifier, it may not represent these attributes which are important for the classifier decision, and the dimensions of StyleSpace may represent irrelevant attributes. To overcome this, we propose a training procedure for a StyleGAN, which incorporates the classifier model, in order to learn a classifier-specific StyleSpace. Explanatory attributes are then selected from this space. These can be used to visualize the effect of changing multiple attributes per image, thus providing image-specific explanations. We apply StylEx to multiple domains, including animals, leaves, faces and retinal images. For these, we show how an image can be modified in different ways to change its classifier output. Our results show that the method finds attributes that align well with semantic ones, generate meaningful image-specific explanations, and are human-interpretable as measured in user-studies.

Submitted 1 September, 2021; v1 submitted 27 April, 2021; originally announced April 2021.

Comments: Accepted to ICCV 2021. Project page: https://explaining-in-style.github.io/, Code: https://github.com/google/explaining-in-style

arXiv:2002.12764 [pdf, other]

doi 10.21437/Interspeech.2020-1242

Towards Learning a Universal Non-Semantic Representation of Speech

Authors: Joel Shor, Aren Jansen, Ronnie Maor, Oran Lang, Omry Tuval, Felix de Chaumont Quitry, Marco Tagliasacchi, Ira Shavitt, Dotan Emanuel, Yinnon Haviv

Abstract: The ultimate goal of transfer learning is to reduce labeled data requirements by exploiting a pre-existing embedding model trained for different datasets or tasks. The visual and language communities have established benchmarks to compare embeddings, but the speech community has yet to do so. This paper proposes a benchmark for comparing speech representations on non-semantic tasks, and proposes a representation based on an unsupervised triplet-loss objective. The proposed representation outperforms other representations on the benchmark, and even exceeds state-of-the-art performance on a number of transfer learning tasks. The embedding is trained on a publicly available dataset, and it is tested on a variety of low-resource downstream tasks, including personalization tasks and medical domain. The benchmark, models, and evaluation code are publicly released.

Submitted 6 August, 2020; v1 submitted 25 February, 2020; originally announced February 2020.

Journal ref: Proceedings of INTERSPEECH 2020

arXiv:1907.13511 [pdf, other]

doi 10.21437/Interspeech.2019-1427

Personalizing ASR for Dysarthric and Accented Speech with Limited Data

Authors: Joel Shor, Dotan Emanuel, Oran Lang, Omry Tuval, Michael Brenner, Julie Cattiau, Fernando Vieira, Maeve McNally, Taylor Charbonneau, Melissa Nollstadt, Avinatan Hassidim, Yossi Matias

Abstract: Automatic speech recognition (ASR) systems have dramatically improved over the last few years. ASR systems are most often trained from 'typical' speech, which means that underrepresented groups don't experience the same level of improvement. In this paper, we present and evaluate finetuning techniques to improve ASR for users with non-standard speech. We focus on two types of non-standard speech: speech from people with amyotrophic lateral sclerosis (ALS) and accented speech. We train personalized models that achieve 62% and 35% relative WER improvement on these two groups, bringing the absolute WER for ALS speakers, on a test set of message bank phrases, down to 10% for mild dysarthria and 20% for more serious dysarthria. We show that 71% of the improvement comes from only 5 minutes of training data. Finetuning a particular subset of layers (with many fewer parameters) often gives better results than finetuning the entire model. This is the first step towards building state of the art ASR models for dysarthric speech.

Submitted 31 July, 2019; originally announced July 2019.

Comments: 5 pages

arXiv:1807.04051 [pdf, other]

doi 10.1109/TMTT.2019.2896513

A Robust Nonlinear RLS Type Adaptive Filter for Second-Order-Intermodulation Distortion Cancellation in FDD LTE and 5G Direct Conversion Transceivers

Authors: Andreas Gebhard, Oliver Lang, Michael Lunglmayr, Christian Motz, Ram Sunil Kanumalli, Christina Auer, Thomas Paireder, Matthias Wagner, Harald Pretl, Mario Huemer

Abstract: Transceivers operating in frequency division duplex experience a transmitter leakage (TxL) signal into the receiver due to the limited duplexer stop-band isolation. This TxL signal in combination with the second-order nonlinearity of the receive mixer may lead to a baseband (BB) second-order intermodulation distortion (IMD2) with twice the transmit signal bandwidth. In direct conversion receivers, this nonlinear IMD2 interference may cause a severe signal-to-interference-plus-noise ratio degradation of the wanted receive signal. This contribution presents a nonlinear Wiener model recursive least-squares (RLS) type adaptive filter for the cancellation of the IMD2 interference in the digital BB. The included channel-select and DC-notch filters at the output of the proposed adaptive filter ensure that the provided IMD2 replica includes the receiver front-end filtering. A second, robust version of the nonlinear RLS algorithm is derived which provides numerical stability for highly correlated input signals, which arise, e.g., in LTE-A intra-band multi-cluster transmission scenarios. The performance of the proposed algorithms is evaluated by numerical simulations and by measurement data.

Submitted 11 July, 2018; originally announced July 2018.

arXiv:1804.03619 [pdf, other]

doi 10.1145/3197517.3201357

Looking to Listen at the Cocktail Party: A Speaker-Independent Audio-Visual Model for Speech Separation

Authors: Ariel Ephrat, Inbar Mosseri, Oran Lang, Tali Dekel, Kevin Wilson, Avinatan Hassidim, William T. Freeman, Michael Rubinstein

Abstract: We present a joint audio-visual model for isolating a single speech signal from a mixture of sounds such as other speakers and background noise. Solving this task using only audio as input is extremely challenging and does not provide an association of the separated speech signals with speakers in the video. In this paper, we present a deep network-based model that incorporates both visual and auditory signals to solve this task. The visual features are used to "focus" the audio on desired speakers in a scene and to improve the speech separation quality. To train our joint audio-visual model, we introduce AVSpeech, a new dataset comprised of thousands of hours of video segments from the Web. We demonstrate the applicability of our method to classic speech separation tasks, as well as real-world scenarios involving heated interviews, noisy bars, and screaming children, only requiring the user to specify the face of the person in the video whose speech they want to isolate. Our method shows clear advantage over state-of-the-art audio-only speech separation in cases of mixed speech. In addition, our model, which is speaker-independent (trained once, applicable to any speaker), produces better results than recent audio-visual speech separation methods that are speaker-dependent (require training a separate model for each speaker of interest).

Submitted 9 August, 2018; v1 submitted 10 April, 2018; originally announced April 2018.

Comments: Accepted to SIGGRAPH 2018. Project webpage: https://looking-to-listen.github.io

Journal ref: ACM Trans. Graph. 37(4): 112:1-112:11 (2018)

arXiv:1712.02146 [pdf, other]

Knowledge-Aided Kaczmarz and LMS Algorithms

Authors: Michael Lunglmayr, Oliver Lang, Mario Huemer

Abstract: The least mean squares (LMS) filter is often derived via the Wiener filter solution. For a system identification scenario, such a derivation makes it hard to incorporate prior information on the system's impulse response. We present an alternative way based on the maximum a posteriori solution, which allows developing a Knowledge-Aided Kaczmarz algorithm. Based on this Knowledge-Aided Kaczmarz we formulate a Knowledge-Aided LMS filter. Both algorithms allow incorporating the prior mean and covariance matrix on the parameter to be estimated. The algorithms use this prior information in addition to the measurement information in the gradient for the iterative update of their estimates. We analyze the convergence of the algorithms and show simulation results on their performance. As expected, reliable prior information allows improving the performance of the algorithms for low signal-to-noise (SNR) scenarios. The results show that the presented algorithms can nearly achieve the optimal maximum a posteriori (MAP) performance.

Submitted 6 December, 2017; originally announced December 2017.
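
For orientation, the classical Kaczmarz iteration that the entry above builds on can be sketched as follows. This is the standard textbook row-projection update only; the knowledge-aided variants from the paper, which additionally use a prior mean and covariance, are not reproduced here. The function name `kaczmarz` and the example system are illustrative assumptions.

```python
def kaczmarz(rows, rhs, x0, sweeps=200):
    """Standard Kaczmarz iteration for a consistent linear system rows @ x = rhs.

    Each step projects the current estimate onto the hyperplane defined by
    one equation: x <- x + (b_i - a_i . x) / ||a_i||^2 * a_i.
    """
    x = list(x0)
    for _ in range(sweeps):
        for a, b in zip(rows, rhs):
            residual = b - sum(ai * xi for ai, xi in zip(a, x))
            norm_sq = sum(ai * ai for ai in a)
            x = [xi + residual / norm_sq * ai for xi, ai in zip(x, a)]
    return x


if __name__ == "__main__":
    # Small consistent system whose exact solution is [1, -1].
    estimate = kaczmarz([[2.0, 1.0], [1.0, 3.0]], [1.0, -2.0], [0.0, 0.0])
    print(estimate)
```

The update touches one measurement row at a time, which is the same structure an LMS filter uses when it processes one input vector per sample.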

Showing 1–14 of 14 results for author: Lang, O