HarmonICA: Neural non-stationarity correction and source separation for motor neuron interfaces

Alexander Kenneth Clarke
Imperial College London
London, UK

Agnese Grison
Imperial College London
London, UK

Irene Mendez Guerra
Imperial College London
London, UK

Pranav Mamidanna
Imperial College London
London, UK

Shihan Ma
Imperial College London
London, UK

Silvia Muceli
Chalmers University of Technology
Gothenburg, Sweden

Dario Farina
Imperial College London
London, UK

Abstract

A major outstanding problem when interfacing with spinal motor neurons is how to accurately compensate for non-stationary effects in the signal during source separation routines, particularly when they cannot be estimated in advance. This forces current systems to instead use the undifferentiated bulk signal, which limits the potential degrees of freedom for control. In this study we propose a potential solution, using an unsupervised learning algorithm to blindly correct for the effects of latent processes which drive the signal non-stationarities. We implement this methodology within the theoretical framework of a quasilinear version of independent component analysis (ICA). The proposed design, HarmonICA, sidesteps the identifiability problems of nonlinear ICA, allowing for equivalent predictability to linear ICA whilst retaining the ability to learn complex nonlinear relationships between non-stationary latents and their effects on the signal. We test HarmonICA on invasive and non-invasive recordings, both simulated and real, demonstrating an ability to blindly compensate for the non-stationary effects specific to each, and thus to significantly enhance the quality of a source separation routine.

1 Introduction

Great strides have been made in the miniaturisation and reliability of high density electrode arrays for capturing surface and intramuscular electromyography (EMG) signals[1, 2]. With source separation algorithms, these recordings give direct access to the spinal motor neuron activity and so provide an excellent target for human computer interfacing applications. Reliable access to individual motor neurons offers the potential for a new generation of intuitive prosthetics and may even allow the extra degrees of freedom needed to control additional robotic limbs[3, 4, 5].

Despite these opportunities, a significant remaining obstacle to the development of interfaces based on individual neural activity is the brittleness of current decomposition methods to non-stationarities in the signal, which limits the majority of spinal interfaces to inference at the population level. These non-stationarities are the result of changes in the spatial position of the electrodes relative to the neuron and fluctuations in the volume conductor through which the biopotential is propagating. The former, called drift, can be estimated and corrected by interpolation methods for small movements in the plane of the electrode array, which is the basis of modern intracortical spike sorting algorithms[6, 7].

Outside of this specific scenario, when there is drift in multiple non-planar directions or volume conductor change, it remains extremely difficult to compensate for the complex resultant variation in the action potential waveforms over time during the source separation routine. There have been some recent successes using sliding windows and post hoc correction to handle subtle variation over time[8, 9], but these approaches often fail when neurons are not consistently active in all windows or the non-stationary changes occur too rapidly. This has a strong limiting effect on experimental design; for example, almost all spinal neuron recordings made using surface EMG restrict the subject to a fixed isometric position to prevent volume conductor variation.

In this work we propose HarmonICA, a deep learning model for source separation of non-stationary neurophysiological time series using quasi-linear independent component analysis (ICA). By using an alternating backpropagation strategy to revert to a linear model during source estimation, we sidestep the identifiability problems of true non-linear blind source separation. This gives the model the predictability of linear ICA, whilst retaining the ability to learn and compensate for the non-linear relationships between time-varying components and their effect on spike waveforms. We go on to demonstrate the efficacy of the model experimentally on both surface and intramuscular EMG, recovering sources with high accuracy on a number of common non-stationary effects. These include, for the first time we are aware of in the literature, recovery of spinal motor neuron activity from non-isometric surface EMG without windowing.

2 Method

2.1 Preliminaries

The data generation process of a neurophysiological signal can be modelled as the observed transformed activity of a set of $M$ latents, the spiking neuronal sources, for which we assume independence, i.e.:

p(\boldsymbol{s})=\prod_{j=1}^{M}p(\boldsymbol{s}_{j})

(1)

The source vector sample is transformed by a mixing function $\boldsymbol{f}:\boldsymbol{s}\rightarrow\boldsymbol{x}$ into an observation vector sample $\boldsymbol{x}$ detected by the electrode array. We can formulate the blind source separation goal as the unsupervised discovery of $\boldsymbol{f}^{-1}:\boldsymbol{x}\rightarrow\boldsymbol{s}$ , i.e. we want to find a separation function we can apply to the observations that recovers the underlying spiking activity.

The distribution of each spiking source $p(s_{j})$ can be described by:

p(s_{j})=(1-\psi)\delta(s_{j}-c_{null})+\psi\delta(s_{j}-c_{spike})

(2)

where $\delta(.)$ is the Dirac delta function, $c_{null}$ and $c_{spike}$ are arbitrary constants of different value and $\psi$ is the probability of spike activation determined by upstream neural drives and the neuron's refractory period. The length $T$ recorded time series vector sampled from $p(\boldsymbol{s}_{j})$ can be described by a scaled binary sequence dominated by $c_{null}$ :

\boldsymbol{s}_{j}(t)=\begin{cases}c_{spike}\mspace{18.0mu}\text{if neuron spikes at }t\\ c_{null}\mspace{18.0mu}\text{otherwise}\end{cases}

(3)

At each time instant $t$ in the recording there exists a length $L$ transfer vector $\boldsymbol{h}_{ij}$ pairing an activation of the $j$ -th neuron with the resultant detected duration $L$ action potential at the $i$ -th electrode assuming adequate time support. $\boldsymbol{h}_{ij}$ can itself be described as the output from a deterministic function $\boldsymbol{g}:\boldsymbol{z_{j}}\rightarrow\boldsymbol{h}_{ij}$ which takes as inputs the values of a latent random vector $\boldsymbol{z_{j}}$ which abstracts concepts like the relative position and alignment of the neuron with respect to the electrode and the makeup of the intervening volume conductor. When $M$ sources have non-zero transfer functions, $\boldsymbol{x}_{i}(t)$ , the resultant summed contributions to the $i$ -th electrode at $t$ can be described by a convolutive mixing model:

\boldsymbol{x}_{i}(t)=\sum_{j=1}^{M}\boldsymbol{h}_{ij}(t)*\boldsymbol{s}_{j}(t)+\boldsymbol{\omega}_{i}(t)

(4)

where $\boldsymbol{h}_{ij}(t)=\boldsymbol{g}(\boldsymbol{z_{j}}(t))$ and $\boldsymbol{\omega}_{i}(t)$ is an additive noise vector.
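As an illustration of equation 4 for a single source and a single electrode, the following minimal sketch generates a signal in which the waveform amplitude changes over time. The function name `convolutive_mix` and the scalar `gain` standing in for $\boldsymbol{g}(\boldsymbol{z_{j}}(t))$ are assumptions made for the example; a real transfer function would change the waveform shape as well as its scale, and noise is omitted.

```python
import numpy as np

def convolutive_mix(s, h, gain):
    """Single-source, single-electrode version of equation 4 without noise.

    s:    (T,) spike train with c_null = 0
    h:    (L,) base action potential waveform
    gain: (T,) hypothetical time-varying scale standing in for g(z(t))
    """
    s = np.asarray(s, dtype=float)
    h = np.asarray(h, dtype=float)
    T, L = s.size, h.size
    x = np.zeros(T + L - 1)
    for t in np.flatnonzero(s):
        # Each spike emits the waveform as it looks at the moment of firing.
        x[t : t + L] += gain[t] * s[t] * h
    return x[:T]

s = np.zeros(10)
s[[2, 6]] = 1.0
x = convolutive_mix(s, h=[1.0, 2.0], gain=np.array([1.0] * 5 + [0.5] * 5))
```

The second spike produces the same waveform at half the amplitude, which is the kind of modulation a fixed linear separation vector cannot follow.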

We can further simplify to an instantaneous mixing model by extending the observation, source, transfer and noise vectors at each $t$ with $L$ past values, indicated by a tilde $\tilde{.}$ in the below definitions. Generalised to an array of $N$ electrodes, this gives us our full description of the mixing function $\boldsymbol{f}$ :

\boldsymbol{\tilde{x}}(t)=\boldsymbol{H}(t)\boldsymbol{\tilde{s}}(t)+\boldsymbol{\tilde{\omega}}(t)

(5)

where $\boldsymbol{H}(t)$ is a matrix constructed from the $\boldsymbol{\tilde{h}}^{T}(t)$ extended transfer vector outputs given $\boldsymbol{z}(t)$ for each source, each of which is triggered by $\boldsymbol{\tilde{s}}(t)$ , the extended vector of $M$ binary sources at time $t$ , which we present below in block format:

\begin{split}\boldsymbol{\tilde{s}}(t)=[\boldsymbol{\tilde{s}}_{1}(t),\boldsymbol{\tilde{s}}_{2}(t),...,\boldsymbol{\tilde{s}}_{j}(t),...,\boldsymbol{\tilde{s}}_{M}(t)]^{T}\\ \boldsymbol{\tilde{s}}_{j}(t)=[s_{j}(t),s_{j}(t-1),...,s_{j}(t-L+1)]\end{split}

(6)

and finally the summed impulse responses are then observed by $N$ electrodes as the signal $\boldsymbol{\tilde{x}}(t)$ , which we again present in block format:

\begin{split}\boldsymbol{\tilde{x}}(t)=[\boldsymbol{\tilde{x}}_{1}(t),\boldsymbol{\tilde{x}}_{2}(t),...,\boldsymbol{\tilde{x}}_{i}(t),...,\boldsymbol{\tilde{x}}_{N}(t)]^{T}\\ \boldsymbol{\tilde{x}}_{i}(t)=[x_{i}(t),x_{i}(t-1),...,x_{i}(t-L+1)]\end{split}

(7)
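The extension in equation 7 can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' code: the function name `extend` and the zero padding at the start of the recording are assumptions, and it requires $L \leq T$.

```python
import numpy as np

def extend(x, L):
    """Stack each channel with its delayed copies, as in equation 7.

    x: array of shape (N, T). Returns an array of shape (N * L, T) in which
    row i * L + k holds channel i delayed by k samples, with zeros where no
    past value exists (an assumption; requires L <= T).
    """
    N, T = x.shape
    out = np.zeros((N * L, T), dtype=x.dtype)
    for k in range(L):
        # Delay by k samples: out[i * L + k, t] = x[i, t - k] for t >= k.
        out[k::L, k:] = x[:, : T - k]
    return out

x = np.arange(10, dtype=float).reshape(2, 5)
x_ext = extend(x, L=3)
```

Each column of `x_ext` is one extended observation vector $\boldsymbol{\tilde{x}}(t)$, so the convolutive mixture can be treated as instantaneous from here on.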

The bulk of source separation algorithms in the neurophysiological literature require that the mixing matrix be fixed with respect to time as a necessary assumption, meaning we make the assumption $\boldsymbol{z}(0)\approx\boldsymbol{z}(t)\approx\boldsymbol{z}(T)$ which implies $\boldsymbol{H}(0)\approx\boldsymbol{H}(t)\approx\boldsymbol{H}(T)$ . This often means head-fixed intracortical recordings in animal models, or isometric contractions in spinal motor neuron recordings.

When this assumption is violated, source separation quality is usually poor. Variation in the observed action potential waveform causes incorrect class splitting and merging in second order spike sorting algorithms, whilst in higher order blind source separation algorithms these non-stationarities cause optimisation problems and missed spiking events. In recent years, intracortical spike sorting algorithms have begun to incorporate a preprocessing step in their pipelines to correct for a specific type of non-stationarity due to drift in the plane of the linear arrays commonly used for such recordings[10, 11, 6]. This is possible to both estimate and correct because the actual waveform shape is mostly unchanged, just its spatial positioning in the multichannel recording. For other types of drift and for changes due to volume conductor variance, variation in $\boldsymbol{H}$ with respect to time becomes extremely difficult to account for, and source separation in such scenarios is as of this time an unresolved problem.

Note.

In intracortical neurophysiology, the rows of $\boldsymbol{H}(t)$ are simply the extended action potential waveforms associated with each neuronal source and observed by the electrode arrays after modification by the intervening volume conductor. In intramuscular and surface electromyography, these waveforms are instead the summed action potentials of a pool of muscle fibers being triggered in one-to-one correspondence by a single spinal motor neuron. These differences have significant implications for the waveform characteristics, but the described forward model is nevertheless valid for all of these modalities.

2.2 Independent component analysis

We aim to recover the neuron sources using blind source separation techniques to discover a separation function which inverts the mixing process in equation 4. We will use an ICA framework, in which we take advantage of the independence assumption specified in equation 1 to learn the function $\boldsymbol{w}\approx\boldsymbol{f}^{-1}$ :

\hat{s}(t)=\boldsymbol{w}(\boldsymbol{\tilde{x}}(t);\boldsymbol{\theta})

(8)

where $\boldsymbol{\theta}$ are the learned parameters of the model, which we find by minimising the negative log likelihood:

\mathcal{L}_{I}(\boldsymbol{\theta})=-\log p_{\boldsymbol{\hat{s}}}\big{(}\boldsymbol{w}(\boldsymbol{\tilde{x}};\boldsymbol{\theta})\big{)}-\log|\det J\big{(}\boldsymbol{w}(\boldsymbol{\tilde{x}}(t);\boldsymbol{\theta})\big{)}|

(9)

which we will proceed to optimise through maximisation of higher order statistical estimators. Here $\det J(.)$ is the determinant of the Jacobian, which ensures the distribution remains normalised. In practice we can drop the Jacobian term by constraining $\boldsymbol{w}$ to be unit norm.

For clarity of explanation, we will initially aim to use a purely linear form of $\boldsymbol{w}$ , i.e. $\boldsymbol{\theta}=\boldsymbol{W}$ , which reduces equation 8 to:

\hat{s}(t)=\boldsymbol{W}\boldsymbol{\tilde{x}}(t)

(10)

and equation 9 to:

\mathcal{L}_{I}(\boldsymbol{W})=-\log p_{\boldsymbol{\hat{s}}}\big{(}\boldsymbol{W}\boldsymbol{\tilde{x}}\big{)}-\log|\det J\big{(}\boldsymbol{W}\big{)}|

(11)

For $\log p_{\boldsymbol{\hat{s}}}$ we want an estimator that is maximal when the distributions are independent, which by well known theory is analogous to maximising an estimator of a higher order statistic[12]. Spiking sources are extremely sparse, supergaussian and asymmetric, so the polynomial cumulant estimator $\log p_{\boldsymbol{\hat{s}}}\approx sign(\boldsymbol{\hat{s}})\boldsymbol{\hat{s}}^{|e|}$ works well, where $e$ is an exponent hyperparameter and $sign(.)$ returns the element-wise sign. Assuming a constrained optimisation and using a projection pursuit routine to extract one source at a time, this gives our final independence loss for the $j$ -th source:

\mathcal{L}_{I}(\boldsymbol{w}_{j})=-\mathbb{E}\big{[}sign(\boldsymbol{\hat{s}}_{j})\boldsymbol{\hat{s}}_{j}^{|e|}\big{]}

(12)

where $\mathbb{E}$ is the expectation.
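To make the behaviour of equation 12 concrete, the sketch below evaluates it on a Gaussian signal and on a sparse, one-sided spike-like signal of similar power. It is an illustration only: the function name `independence_loss` is an assumption, the expectation is replaced by a sample mean, and $e$ is taken to be an integer.

```python
import numpy as np

def independence_loss(s_hat, e=2):
    """Sample estimate of equation 12: -E[sign(s) * s**|e|], for integer e."""
    s_hat = np.asarray(s_hat, dtype=float)
    return -np.mean(np.sign(s_hat) * s_hat ** abs(e))

rng = np.random.default_rng(0)
gaussian = rng.standard_normal(10_000)

# Sparse positive spikes: 100 samples of amplitude 10 among 10,000 zeros,
# giving a mean square of 1, comparable to the Gaussian signal.
sparse = np.zeros(10_000)
sparse[::100] = 10.0

loss_gaussian = independence_loss(gaussian)
loss_sparse = independence_loss(sparse)
```

With an even exponent the signed estimator is close to zero for a symmetric signal and strongly negative for a sparse, positively skewed one, so minimising it pushes the projection towards spike-like sources.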

2.3 Estimating a non-stationarity

Under stationary conditions and given an optimal set of $\boldsymbol{\theta}_{j}$ , the predicted source distribution $p(\boldsymbol{\hat{s}}_{j})$ converges on the true distribution $p(\boldsymbol{s}_{j})$ . However, when $\boldsymbol{z}$ and therefore $\boldsymbol{H}$ vary with time, the fixed $\boldsymbol{W}$ will not be able to learn to invert the mixing model, and the spike amplitudes of $\boldsymbol{\hat{s}}_{j}$ will become modulated based on the unknown form of $\boldsymbol{g}$ . This leads to either convergence on only parts of the true $p(\boldsymbol{s}_{j})$ or complete optimisation failure.

To measure and correct for this modulation we take advantage of our prior knowledge of the distribution $p(\boldsymbol{s}_{j})$ as specified in equation 2. We first relax the deterministic component to give a two-component Gaussian mixture model (GMM) fitted to the output time series of predictions on $\boldsymbol{\hat{s}}_{j}$ :

q(\boldsymbol{\hat{s}}_{j})=(1-\pi)\mathcal{N}(\hat{s}_{j}\mid c_{null};\sigma_{null}^{2})+\pi\mathcal{N}(\hat{s}_{j}\mid c_{spike};\sigma_{spike}^{2})

(13)

where $\psi$ in equation 2 is modelled by a mixing coefficient $\pi$ , and Gaussian priors of variance $\sigma_{null}^{2}$ and $\sigma_{spike}^{2}$ are placed over the null and spike values.

Given an optimal set of $\boldsymbol{\theta}_{j}$ and assuming a well fitted GMM and no noise term, the only source of variation in the null and spike Gaussian distributions is the unexplained variance caused by a non-stationary $\boldsymbol{z}$ . To measure this we take the square root of the summed estimated variances of the Gaussian distributions, $\sigma_{null}^{2}$ and $\sigma_{spike}^{2}$ , scaled by a weighting factor $\lambda$ , as our second loss term, giving the loss function:

\mathcal{L}_{N}(\boldsymbol{\theta}_{j})=\lambda(\sigma_{null}^{2}+\sigma_{spike}^{2})^{\frac{1}{2}}

(14)

To avoid fitting a GMM to a long time series with many samples, we instead use an efficient peak finding algorithm to subselect likely members of the null and spike Gaussian distributions with which to initialise the model. We then run a small number of iterations of the expectation-maximisation algorithm to get the final GMM parameters.
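The GMM fit and the loss of equation 14 can be sketched as follows. This is a simplified illustration under stated assumptions: the names `fit_two_gaussians` and `nonstationarity_loss` are hypothetical, the initial spike/null labels are passed in as a boolean mask rather than produced by a peak finder, and the fit runs over every sample instead of a subselected set.

```python
import numpy as np

def fit_two_gaussians(s_hat, spike_mask, n_steps=5):
    """Fit the two-component GMM of equation 13 with a few EM steps.

    Returns (pi, means, variances); index 0 is the null component and
    index 1 is the spike component.
    """
    s_hat = np.asarray(s_hat, dtype=float)
    spike_mask = np.asarray(spike_mask, dtype=bool)
    # Initialise each component from its provisional members.
    means = np.array([s_hat[~spike_mask].mean(), s_hat[spike_mask].mean()])
    var = np.array([s_hat[~spike_mask].var(), s_hat[spike_mask].var()]) + 1e-8
    pi = spike_mask.mean()
    for _ in range(n_steps):
        # E-step: posterior membership probability of each sample.
        weights = np.array([1.0 - pi, pi])[:, None]
        sq_dist = (s_hat[None, :] - means[:, None]) ** 2
        dens = weights * np.exp(-0.5 * sq_dist / var[:, None])
        dens = dens / np.sqrt(2.0 * np.pi * var[:, None])
        resp = dens / (dens.sum(axis=0, keepdims=True) + 1e-300)
        # M-step: re-estimate means, variances and the mixing coefficient.
        nk = resp.sum(axis=1)
        means = (resp * s_hat).sum(axis=1) / nk
        var = (resp * (s_hat[None, :] - means[:, None]) ** 2).sum(axis=1) / nk + 1e-8
        pi = nk[1] / s_hat.size
    return pi, means, var

def nonstationarity_loss(var, lam=1.0):
    """Equation 14: lambda * sqrt(sigma_null^2 + sigma_spike^2)."""
    return lam * np.sqrt(var[0] + var[1])

rng = np.random.default_rng(0)
s_hat = np.concatenate([rng.normal(0.0, 0.1, 5000), rng.normal(1.0, 0.1, 50)])
pi, means, var = fit_two_gaussians(s_hat, spike_mask=s_hat > 0.5)
loss = nonstationarity_loss(var)
```

In this toy example the spike amplitudes are stable, so the loss is small; amplitude modulation from a drifting $\boldsymbol{H}$ would widen the spike component and raise it.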

2.4 HarmonICA: A quasilinear ICA model

Of course we have not yet given the model the ability to account for variation in $\boldsymbol{H}$ with time. As the relationship between $\boldsymbol{z}$ and $\boldsymbol{H}$ can be highly nonlinear, we would ideally like to replace our linear model and use a universal function estimator such as a deep neural network for the model predicting the $j$ -th source $\boldsymbol{w}_{j}$ . However, this immediately causes problems by reducing the identifiability of the ICA model, i.e. optimal versions of $\boldsymbol{\theta}_{j}$ will give different predictions on the same source[13]. We sidestep this problem with the following model construction, which we name Harmonising Independent Component Analysis or HarmonICA.

Given a neural network ${N\!N}_{j}$ parameterised by $\boldsymbol{\phi}_{j}$ , we separate out the matrix multiplication of the final linear layer, $\boldsymbol{W_{j}}$ . The purpose of ${N\!N}_{j}$ is, given $t$ , to learn the parameters of an affine transformation of $\boldsymbol{W_{j}}$ which compensates for the time-varying $\boldsymbol{H}$ . The modified $\boldsymbol{W_{j}}$ is then directly applied to $\boldsymbol{\tilde{x}}(t)$ to give $\hat{s}_{j}(t)$ :

\begin{split}\hat{s}_{j}(t)=\big{(}\boldsymbol{a}\otimes\boldsymbol{W_{j}}\big{)}\boldsymbol{\tilde{x}}(t)+\boldsymbol{b}\\ \log(\boldsymbol{a}),\boldsymbol{b}=split\big{(}\textit{NN}_{j}(\boldsymbol{\gamma}(t);\boldsymbol{\phi}_{j})\big{)}\end{split}

(15)

where $\otimes$ is the element-wise product on the tiled $\boldsymbol{W_{j}}$ and $split(.)$ divides the outputs of ${N\!N}_{j}$ into bias and log weight. Inspired by Neural Radiance Fields[14], we found that using a multivariate sinusoidal positional encoding for $t$ gave effective generalisation performance to a variety of non-stationary modulations:

\gamma(t)=\left(\sin(\alpha\pi t),\sin((\alpha+1)\pi t),...,\sin((\alpha+\beta)\pi t)\right)

(16)
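Equation 16 translates directly into code. The sketch below follows the equation literally; the function name `positional_encoding` and the normalisation of time to the interval $[0, 1]$ are assumptions made for illustration.

```python
import numpy as np

def positional_encoding(t, alpha, beta):
    """Return gamma(t) of equation 16 with shape (len(t), beta + 1)."""
    t = np.asarray(t, dtype=float)
    # Frequency multipliers alpha, alpha + 1, ..., alpha + beta.
    freqs = alpha + np.arange(beta + 1)
    return np.sin(np.pi * t[:, None] * freqs[None, :])

t = np.linspace(0.0, 1.0, 5)  # assumption: time normalised to [0, 1]
gamma = positional_encoding(t, alpha=1, beta=2)
```

Each row of `gamma` is the input to the compensation network for one time sample, giving it a smooth, multi-scale description of where that sample sits in the recording.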

During training the model is optimised using an alternating backpropagation methodology on $\mathcal{L}_{I}(\boldsymbol{W}_{j})$ and then $\mathcal{L}_{N}(\boldsymbol{\phi}_{j})$ . By freezing ${N\!N}_{j}$ during the independence loss update, we prevent nonidentifiability problems whilst still allowing the model to learn complex non-stationary relationships.
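The forward pass of equation 15 can be sketched without any deep learning framework. The shapes used here are assumptions: $\boldsymbol{a}$ is taken to hold one scale per time sample and per element of $\boldsymbol{W}_{j}$, and $\boldsymbol{b}$ one bias per time sample. The function name `quasilinear_source` is hypothetical.

```python
import numpy as np

def quasilinear_source(x_ext, W, log_a, b):
    """Equation 15: s_hat(t) = (a(t) * W) . x_ext(t) + b(t).

    x_ext: (D, T) extended, whitened observations
    W:     (D,)   linear separation vector
    log_a: (T, D) log scale predicted for every time sample (assumed shape)
    b:     (T,)   bias predicted for every time sample (assumed shape)
    """
    # Tile W over time and rescale it element-wise, giving shape (T, D).
    W_t = np.exp(log_a) * W[None, :]
    # Apply the time-specific vector to the matching observation column.
    return np.einsum("td,dt->t", W_t, x_ext) + b

rng = np.random.default_rng(0)
D, T = 4, 6
x_ext = rng.standard_normal((D, T))
W = rng.standard_normal(D)

# With log(a) = 0 and b = 0 the model reduces to linear ICA.
s_linear = quasilinear_source(x_ext, W, np.zeros((T, D)), np.zeros(T))
s_scaled = quasilinear_source(x_ext, W, np.full((T, D), np.log(2.0)), np.ones(T))
```

Because the network only rescales and shifts an otherwise linear projection, holding it fixed during the independence update leaves an ordinary linear ICA problem in $\boldsymbol{W}_{j}$.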

The full algorithm is specified in Algorithm 1. After a round of optimisation a quality metric is calculated on the predicted $j$ -th source, for which we used a pseudo-silhouette measure. This was found by using a hard assignment on the per-sample membership probabilities after fitting the GMM, then calculating the average distance of each sample to its assigned distribution mean versus the other mean. If this met a preset threshold (given in Section 3.4), the source was added to the accepted sources and its contribution to $\boldsymbol{\tilde{x}}$ estimated and subtracted, a common "peel off" routine used to prevent convergence on a previously found source in projection pursuit. Estimation used a local spike triggered average routine.
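One plausible reading of the pseudo-silhouette measure is sketched below. The exact normalisation is not given in the text, so the silhouette-style ratio used here, along with the function name `pseudo_silhouette`, is an assumption for illustration only.

```python
import numpy as np

def pseudo_silhouette(s_hat, labels, means):
    """Silhouette-style score from distances to the two GMM means.

    labels: hard assignment per sample (0 = null, 1 = spike)
    means:  the two fitted component means
    The (other - own) / max(own, other) form is an assumption.
    """
    s_hat = np.asarray(s_hat, dtype=float)
    labels = np.asarray(labels, dtype=int)
    means = np.asarray(means, dtype=float)
    own = np.abs(s_hat - means[labels])        # distance to assigned mean
    other = np.abs(s_hat - means[1 - labels])  # distance to the other mean
    denom = np.maximum(np.maximum(own, other), 1e-12)
    return float(np.mean((other - own) / denom))

score = pseudo_silhouette(
    s_hat=[0.0, 0.01, -0.01, 1.0, 0.99],
    labels=[0, 0, 0, 1, 1],
    means=[0.0, 0.995],
)
```

A score near 1 indicates tight, well separated null and spike clusters, which is the condition under which a source would be accepted.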

Algorithm 1 Harmonising Independent Component Analysis

Input: $\boldsymbol{\tilde{x}}$ : length $T$ EMG signal with $N$ channels - filtered, extended and whitened
Input: $\boldsymbol{\gamma}$ : length $T$ sinusoidal positional encoding of time as per equation 16
Output: $\hat{\boldsymbol{s}}$ : length $T$ estimated spinal neuron activity of $M$ sources

while sources in $\boldsymbol{\tilde{x}}$ do
    Initialize: Randomly initialize $\boldsymbol{W}_{j}$ and $\boldsymbol{\phi}_{j}$
    Initialize: Initialize elements of $\boldsymbol{a}$ to 1 and $\boldsymbol{b}$ to 0
    while $\boldsymbol{W}_{j}$ and $\boldsymbol{\phi}_{j}$ not converged do
        Independence Update
            $\boldsymbol{W}^{U}_{j}\leftarrow\boldsymbol{a}\otimes\boldsymbol{W}_{j}+\boldsymbol{b}$
            $\hat{\boldsymbol{s}}_{j}\leftarrow\boldsymbol{W}^{U}_{j}\boldsymbol{\tilde{x}}$
            Calculate gradient of equation 12 w.r.t. $\boldsymbol{W}_{j}$
            Update $\boldsymbol{W}_{j}$ with its gradient
        Non-stationarity Compensation Update
            $\log(\boldsymbol{a}),\boldsymbol{b}\leftarrow\textit{NN}_{j}(\boldsymbol{\gamma};\boldsymbol{\phi}_{j})$
            $\boldsymbol{W}^{U}_{j}\leftarrow\boldsymbol{a}\otimes\boldsymbol{W}_{j}+\boldsymbol{b}$
            $\hat{\boldsymbol{s}}_{j}\leftarrow\boldsymbol{W}^{U}_{j}\boldsymbol{\tilde{x}}$
            Fit GMM to $\hat{\boldsymbol{s}}_{j}$ as per equation 13
            Calculate gradient of equation 14 w.r.t. $\boldsymbol{\phi}_{j}$
            Update $\boldsymbol{\phi}_{j}$ with its gradient
    end while
    Compute quality metric on $\hat{\boldsymbol{s}}_{j}$
    if quality metric meets threshold then
        Estimate effect of $\hat{\boldsymbol{s}}_{j}$ on $\boldsymbol{\tilde{x}}$ and peel off
        Add $\hat{\boldsymbol{s}}_{j}$ to $\hat{\boldsymbol{s}}$
    end if
end while

2.5 HarmonICA in practice

For effective operation on neurophysiological time series, a number of beneficial steps were taken to improve the likelihood of convergence on good sources. In the preprocessing phase the signal was first digitally filtered, and then after extension whitened using the ZCA method.
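ZCA whitening of the extended signal can be sketched as below. This is a generic NumPy implementation rather than the authors' code; the function name `zca_whiten` and the small regulariser `eps` added to the eigenvalues are assumptions.

```python
import numpy as np

def zca_whiten(x, eps=1e-8):
    """ZCA-whiten x of shape (D, T) so its sample covariance is near identity."""
    x = x - x.mean(axis=1, keepdims=True)
    cov = x @ x.T / x.shape[1]
    # Symmetric eigendecomposition of the covariance matrix.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # ZCA rotates back into the original axes after rescaling, unlike PCA whitening.
    whitening = eigvecs @ np.diag(1.0 / np.sqrt(eigvals + eps)) @ eigvecs.T
    return whitening @ x

rng = np.random.default_rng(0)
mixing = np.array([[2.0, 1.0, 0.0], [0.0, 1.0, 0.5], [0.3, 0.0, 1.0]])
x = mixing @ rng.standard_normal((3, 2000))
x_white = zca_whiten(x)
cov_white = x_white @ x_white.T / x_white.shape[1]
```

Extended EMG is often close to rank deficient, so in practice a larger regulariser or truncation of small eigenvalues may be needed.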

We found that the number of update steps needed for convergence could be greatly reduced if $\boldsymbol{W}_{j}$ had begun to converge on a source prior to running the full optimisation routine. This could be achieved by either initialising $\boldsymbol{W}_{j}$ with a high amplitude sample from $\boldsymbol{\tilde{x}}$ [15], or by simply running a pre-training phase of a number of independence update steps before starting the compensation updates. In practice we used a combination of both. During training, the parameters of the compensation network $\boldsymbol{\phi}_{j}$ were initialised by draws from a zero-mean Gaussian with a small standard deviation, which gave low output values of $\log(\boldsymbol{a})$ and $\boldsymbol{b}$ prior to the network being updated. This prevented a big "kick" to a good initialisation of $\boldsymbol{W}_{j}$ when the non-stationarity compensation updates start. The RMSprop algorithm was used to control the gradient descent routine for both the independence and compensation updates. Figure 1 gives some intuition as to what this process looks like in practice.

After a source was accepted, it was immediately converted to spike timestamps using the same GMM process. Occasionally the peel-off routine would fail to prevent convergence on a previously found source. This would be identified by calculating an accuracy statistic between the new timestamp set and the timestamp sets of previous sources:

Accuracy=\frac{\tau}{\tau+\upsilon^{1}+\upsilon^{2}}\%

(17)

where $\tau$ is the number of timestamp matches within a 5ms tolerance window and $\upsilon^{1}$ and $\upsilon^{2}$ are the numbers of unmatched timestamps in the new source and an old source respectively. When this met a certain threshold the source would not be added to the accepted sources, but its contribution to the signal would still be estimated and peeled off to prevent reconvergence.
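The accuracy statistic of equation 17 can be computed with a single pass over two sorted timestamp lists. The text does not specify how matches are paired, so the greedy one-to-one matching below is an assumption, as is the function name `match_accuracy`. Timestamps are in seconds.

```python
def match_accuracy(new_ts, old_ts, tol=0.005):
    """Equation 17 as a percentage, matching spikes within tol seconds.

    Uses greedy one-to-one matching over sorted timestamps (an assumption).
    """
    new_ts, old_ts = sorted(new_ts), sorted(old_ts)
    i = j = matches = 0
    while i < len(new_ts) and j < len(old_ts):
        diff = new_ts[i] - old_ts[j]
        if abs(diff) <= tol:
            matches += 1
            i += 1
            j += 1
        elif diff < 0:
            i += 1  # spike in the new source has no partner
        else:
            j += 1  # spike in the old source has no partner
    unmatched_new = len(new_ts) - matches
    unmatched_old = len(old_ts) - matches
    total = matches + unmatched_new + unmatched_old
    return 100.0 * matches / total if total else 0.0

accuracy = match_accuracy(
    new_ts=[0.100, 0.200, 0.300, 0.450],
    old_ts=[0.101, 0.204, 0.320],
)
```

In this example two spikes match, two are unmatched in the new source and one in the old source, so the statistic is 2 / (2 + 2 + 1).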

Figure 1: HarmonICA in operation. (a) After an initial pretraining phase, the linear separation vector has converged on a single source. However, it lacks the capacity to account for the non-stationary latents in its source prediction. (b) Using alternating backpropagation, HarmonICA cycles between improving the separation vector and training a neural network to adapt the vector such that it accounts for non-stationarities at each time point. (c) At the end of training, the source prediction is no longer modulated by the non-stationary latents.

3 Experiments

3.1 Real intramuscular EMG with synthetic drift

HarmonICA was tested on the common intramuscular EMG non-stationarity of probe drift using real intramuscular data from a previous study[16]. For this a 10-second recording was taken from a 30% maximum voluntary contraction ankle dorsiflexion dataset sampled at 10 kHz from 32 channels inserted in the Tibialis Anterior muscle of one participant under isometric conditions. Force feedback was shown on a monitor to the participant. The dataset included a manual decomposition of the signal[17], and so had the timestamps of each active motor unit, which could be used as a ground truth. Alongside the original static set, sinusoidal channel displacements of amplitude 2mm at periods of 1, 0.5, and 0.25 times the signal duration were embedded into this 32-channel dataset, as well as a superposition of a period 1 drift and period 0.25 drift. The original study was conducted with informed consent from participants, was Institutional Review Board reviewed and conformed to the Declaration of Helsinki.

3.2 Simulated non-isometric surface EMG

The proposed model was also validated on a simulation of surface EMG recorded under non-isometric conditions[18]. The time-varying volume conductor imposes a complex non-stationarity of the activation potentials in the signal which cannot be estimated using current linear methods. The neuromechanical model uses neurophysiological parameters inferred from a finite element method simulation[19], and musculoskeletal parameters from a forward dynamics simulation[20], integrating both into accurate predictions of how EMG is modulated by dynamic contractions. The output action potential waveforms are convolved with simulated motor neuron activity, whose timestamps also form the ground truth set. The flexor digitorum superficialis was activated at 30% constant neural excitation, resulting in 26 active motor neurons recorded under isometric and dynamic conditions. The generated surface EMG signals corresponded to a 30 second recording from a square 64-channel array placed on top of the muscle at the proximal third of the forearm, sampled at 2048 Hz with a 30dB signal-to-noise ratio. In the dynamic condition, non-stationarities in the surface EMG signals were induced by the flexion and extension of the wrist (up to 40° in each direction). These movements followed sinusoids, again with periods equal to 1, 0.5, and 0.25 of the signal length, alongside a superposition of 1 and 0.25.

3.3 Real non-isometric surface EMG

Real isometric and dynamic recordings from a previous study were also used in validation[21]. Briefly, surface EMG signals from a square 64-channel array were sampled at 2048 Hz from the forearm flexor muscles of one subject. During the experiment, the participant was asked to perform isometric finger presses in a custom dynamometer, with feedback given on screen. Isometric contractions were recorded at 15% of their maximum voluntary contraction at a fixed neutral wrist position over 30s. This finger task was repeated while a motor moved the participant’s hand from the neutral position to 30° wrist flexion in a staircase pattern (5s isometric phases, followed by 10s ramps with an incline of 1°/s). The latter induced changes in the muscle length and volume conductor surrounding the sensor arrays. As the movement contained isometric portions of signal, these could be decomposed by linear ICA to give a ground truth in these segments. By carefully extending into the dynamic portion of the movement, the full time series of the source activity could be reconstructed by calculating the agreement between the timestamps in the overlapping segments. Because it was impossible to tell if a source was not present in a segment because the linear ICA had failed or the unit had genuinely gone silent, only sources which were present throughout the recording were used as ground truth. As before, the original study was conducted with informed consent from participants, was Institutional Review Board reviewed and conformed to the Declaration of Helsinki.

3.4 Implementation details

During preprocessing, intramuscular EMG was high pass filtered at 100Hz and then extended by 100 past values, whilst surface EMG was bandpassed between 10Hz and 1000Hz before extension by 50 past values. Full configs are available in the code repository https://anonymous.4open.science/r/HarmonICA-FB56/.

PyTorch was used for model implementation, with each recording processed on a single RTX6000 GPU for about 20 seconds per source, good or bad, with around 100 source finding attempts. Both RMSprop optimisers used default values except for learning rates of 0.0001 and 0.00001 on intramuscular and surface EMG respectively. Training was terminated when a median interspike interval statistic stopped decreasing. A silhouette threshold of 0.9 was used for both signal types when deciding whether to keep a source. To prevent sources collapsing on a single sample, the exponent $e$ in equation 12 was set to an initial value of 2 and then incremented to 6 during the independence pretraining phase. Alongside this, the source was clamped during independence updates to be the median value of the top 30 samples.

The non-stationarity compensation neural network architecture was parameterised as a feed-forward neural network with a ReLU activation, batchnorm and dropout between each hidden layer. There were three such layers, each with a hidden dimension of 64. No activation was placed on the final layer. Positional encoding used zero-phase sine waves with periods set as fractions of the total length of the time series; we used a 5th, 7th, 9th, 11th and 13th for all studies.

A 1D indexed max pool was used for efficient peak finding for collecting samples with which to fit the GMM. Peaks over a height of 3 standard deviations of the source were initially labelled as spikes, with their average and standard deviation used to initialise those parameters in the spike Gaussian distribution in the GMM. Statistics calculated on 1000 other samples were used to initialise the null Gaussian distribution. The GMM was then trained by expectation maximisation for 5 steps.
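The peak finding step can be approximated in NumPy with non-overlapping windows, as sketched below. This stands in for an indexed max pool and is not the authors' implementation; the function name `find_peaks_maxpool` and the window length are assumptions.

```python
import numpy as np

def find_peaks_maxpool(s_hat, window=10, n_std=3.0):
    """Return indices of window maxima above n_std standard deviations.

    Mimics a 1D max pool that also returns argmax indices, using
    non-overlapping windows of a hypothetical length `window`.
    """
    s_hat = np.asarray(s_hat, dtype=float)
    n_win = s_hat.size // window
    blocks = s_hat[: n_win * window].reshape(n_win, window)
    # Position of the maximum inside each window, converted to a global index.
    idx = blocks.argmax(axis=1) + window * np.arange(n_win)
    return idx[s_hat[idx] > n_std * s_hat.std()]

s_hat = np.zeros(1000)
s_hat[[105, 512, 873]] = 5.0
peaks = find_peaks_maxpool(s_hat)
```

The samples at `peaks` would seed the spike component of the GMM, with a separate set of other samples used to seed the null component.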

Table 1: Performance of HarmonICA, cBSS[22] and ablation on drifted intramuscular EMG

Drift Period (signal lengths)	Accuracy % (mean $\pm$ std)	Number of Units > 80% (mean $\pm$ std)	Number of Units < 80% (mean $\pm$ std)
HarmonICA
None	97.00 $\pm$ 0.59	23.60 $\pm$ 0.55	0.20 $\pm$ 0.45
1	94.60 $\pm$ 11.90	12.60 $\pm$ 1.14	1.20 $\pm$ 0.84
0.5	94.80 $\pm$ 12.50	8.00 $\pm$ 0.00	0.60 $\pm$ 0.55
0.25	88.80 $\pm$ 17.50	6.20 $\pm$ 0.84	1.40 $\pm$ 1.34
1 + 0.25	90.30 $\pm$ 0.89	4.20 $\pm$ 1.79	0.40 $\pm$ 0.89
cBSS[22] (deterministic)
None	98.80 $\pm$ 1.80	14	0
1	16.40 $\pm$ 12.70	0	18
0.5	14.50 $\pm$ 12.20	0	20
0.25	16.80 $\pm$ 11.60	0	19
1 + 0.25	21.00 $\pm$ 8.38	0	6
HarmonICA with Ablated Compensation
None	98.70 $\pm$ 1.70	16.20 $\pm$ 1.48	0.00 $\pm$ 0.00
1	0.00 $\pm$ 0.00	0.00 $\pm$ 0.00	0.00 $\pm$ 0.00
0.5	0.00 $\pm$ 0.00	0.00 $\pm$ 0.00	0.00 $\pm$ 0.00
0.25	0.00 $\pm$ 0.00	0.00 $\pm$ 0.00	0.00 $\pm$ 0.00
1 + 0.25	0.00 $\pm$ 0.00	0.00 $\pm$ 0.00	0.00 $\pm$ 0.00

4 Results

4.1 Intramuscular EMG

HarmonICA was compared to a modern motor unit decomposition algorithm, convolutive blind source separation (cBSS)[22], which is commonly used to extract spinal motor neuron activity from both intramuscular and surface EMG signal. It was also compared to an ablated version of itself, without the compensation module.

HarmonICA outperformed the baseline cBSS in all conditions (see Table 1). In the presence of non-stationarities in the signal, cBSS fails to find even a single unit above the 80% accuracy threshold, as expected for a purely linear method, whilst the proposed method reliably identifies units, with performance gracefully decreasing in increasingly non-stationary conditions. Notably, this effect is only realized due to the proposed compensation mechanism, as evidenced by the ablation study.

4.2 Simulated Surface EMG

Similar trends were observed in the case of simulated surface EMG signals, where HarmonICA outperformed cBSS in all conditions (see Table 2).

Table 2: Performance of HarmonICA, cBSS[22] and ablation on simulated surface EMG

Non-isometry Period (signal lengths)	Accuracy % (mean $\pm$ std)	Number of Units > 80% (mean $\pm$ std)	Number of Units < 80% (mean $\pm$ std)
HarmonICA
None	98.40 $\pm$ 2.66	11.80 $\pm$ 0.45	0.00 $\pm$ 0.00
1	92.22 $\pm$ 1.50	8.67 $\pm$ 0.58	1.33 $\pm$ 1.53
0.50	85.60 $\pm$ 2.74	8.00 $\pm$ 8.00	2.67 $\pm$ 1.53
0.25	91.20 $\pm$ 1.26	8.33 $\pm$ 1.16	1.33 $\pm$ 2.31
1 + 0.25	61.90 $\pm$ 4.01	5.00 $\pm$ 0.00	8.00 $\pm$ 0.00
cBSS[22] (deterministic)
None	99.60 $\pm$ 0.30	10	0
1	15.00 $\pm$ 17.80	0	9
0.5	19.80 $\pm$ 19.30	0	18
0.25	22.60 $\pm$ 20.50	0	16
1 + 0.25	13.90 $\pm$ 15.00	0	17
HarmonICA with Ablated Compensation
None	99.49 $\pm$ 0.33	11.60 $\pm$ 0.55	0.00 $\pm$ 0.00
1	0.00 $\pm$ 0.00	0.00 $\pm$ 0.00	0.00 $\pm$ 0.00
0.5	0.00 $\pm$ 0.00	0.00 $\pm$ 0.00	0.00 $\pm$ 0.00
0.25	0.00 $\pm$ 0.00	0.00 $\pm$ 0.00	0.00 $\pm$ 0.00
1 + 0.25	0.00 $\pm$ 0.00	0.00 $\pm$ 0.00	0.00 $\pm$ 0.00
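
The summary columns in Table 2 aggregate per-unit accuracies over several random seeds. As a minimal sketch of how such a summary could be computed, the snippet below counts the units above and below an 80% accuracy threshold in each seed and reports the mean and standard deviation across seeds. The function name `summarise_seeds`, the use of the sample standard deviation (`ddof=1`), and averaging accuracy over accepted units only are all assumptions made for illustration, not details confirmed by the text.

```python
import numpy as np


def summarise_seeds(per_seed_accuracies, threshold=80.0):
    """Summarise per-unit accuracies (in %) from several seeds.

    Hypothetical helper: `per_seed_accuracies` holds one sequence of
    per-unit accuracies for each seed.
    """
    n_above, n_below, mean_acc = [], [], []
    for acc in per_seed_accuracies:
        acc = np.asarray(acc, dtype=float)
        accepted = acc > threshold
        n_above.append(int(accepted.sum()))
        n_below.append(int((~accepted).sum()))
        # Assumption: accuracy is averaged over accepted units only,
        # and set to 0 when no unit passes the threshold.
        mean_acc.append(float(acc[accepted].mean()) if accepted.any() else 0.0)

    def mean_std(values):
        values = np.asarray(values, dtype=float)
        # Assumption: sample standard deviation (ddof=1) across seeds.
        std = values.std(ddof=1) if values.size > 1 else 0.0
        return float(values.mean()), float(std)

    return {
        "accuracy": mean_std(mean_acc),
        "units_above": mean_std(n_above),
        "units_below": mean_std(n_below),
    }


# Hypothetical data: four seeds with 12 good units, one seed with 11 good and 1 poor.
seeds = [[95.0] * 12] * 4 + [[95.0] * 11 + [50.0]]
summary = summarise_seeds(seeds)
```

With these made-up inputs the good-unit counts are 12, 12, 12, 12 and 11, so `summary["units_above"]` has a mean of 11.8 and a sample standard deviation of about 0.45.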

4.3 Experimental Surface EMG

On the fully isometric surface EMG recording, HarmonICA closely matched the results of cBSS, although it consistently found two units across the seeds which were not in the cBSS yield of 29 motor neurons. Over 5 seeds, HarmonICA matched a mean of 25.8 of the cBSS units with an average accuracy of 97.31% (std 2.72).

For the two dynamic surface EMG recordings there was no true ground truth, but sources found on small near-isometric portions were carefully curated and then used to reconstruct a full time series over the dynamic contraction. Using this method, 6 sources which traversed the entire time series were found in the first recording, and 7 in the second. As with the simulations, and consistent with theory, neither cBSS nor the ablated version of HarmonICA was able to find any units. However, the full version of HarmonICA was able to match 5.2 (std 0.45) of the 6 tracked motor neurons in the first recording, at an average accuracy of 98.66% (std 2.95). In the second recording, HarmonICA matched 6.2 (std 0.45) of the 7 units at an accuracy of 96.98% (std 3.66).
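
Matching the units of one decomposition against a reference set, as done above, requires a spike-train similarity score and an assignment rule. The following is a minimal sketch under stated assumptions: it scores pairs with the rate of agreement (common spikes divided by the total number of distinct spikes), which is a common choice in the EMG decomposition literature but is not confirmed here as the exact accuracy metric used in this study. The function names, the 10-sample tolerance, the 80% acceptance threshold, and the greedy assignment are hypothetical, and a real comparison would likely also need to compensate for any constant delay between the trains.

```python
import numpy as np


def rate_of_agreement(spikes_a, spikes_b, tol=10):
    """Rate of agreement (%) between two spike trains given as sample indices."""
    a = np.sort(np.asarray(spikes_a, dtype=np.int64))
    b = np.sort(np.asarray(spikes_b, dtype=np.int64))
    i = j = common = 0
    # Two-pointer sweep: each spike pairs with at most one spike in the other train.
    while i < len(a) and j < len(b):
        if abs(a[i] - b[j]) <= tol:
            common += 1
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    total = len(a) + len(b) - common  # common + unmatched in a + unmatched in b
    return 100.0 * common / total if total else 0.0


def match_units(reference, candidates, tol=10, threshold=80.0):
    """Greedily assign each reference unit to its best unused candidate unit."""
    matches, used = {}, set()
    for r, ref in enumerate(reference):
        scores = [
            -1.0 if c in used else rate_of_agreement(ref, cand, tol)
            for c, cand in enumerate(candidates)
        ]
        if not scores:
            continue
        best = int(np.argmax(scores))
        if scores[best] > threshold:
            matches[r] = (best, scores[best])
            used.add(best)
    return matches


# Hypothetical spike trains (sample indices) for illustration only.
reference = [np.array([100, 200, 300, 400]), np.array([150, 250, 350])]
candidates = [np.array([152, 251, 349]), np.array([101, 199, 302, 400])]
matches = match_units(reference, candidates)
```

Here `matches` maps each reference unit index to a `(candidate index, rate of agreement)` pair; the number of entries and the mean of the scores correspond to the "units matched" and "average accuracy" style of summary reported in this section.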

5 Limitations

The main limitation of this work is the difficulty of validating neurophysiological blind source separation algorithms on real data. Whilst modern full and hybrid simulations have high fidelity to real recordings, they are ideally only one half of the validation pipeline. We hope in future studies to build a set of source-matched paired surface and intramuscular EMG data during dynamic contractions, which would form a gold standard for such studies. We also hope to better understand how resilient HarmonICA is to different noise conditions. Finally, a major limitation of HarmonICA is its computational cost: it requires the VRAM and compute capacity of modern GPUs to build a set of separation vectors and compensation networks. Whilst, by design, the source separation model is more robust to non-stationarity and so will need less recalculation, modifications will be needed to make the pipeline efficient enough to meet interfacing requirements.

6 Conclusion

We present a practical approach to the direct decomposition of non-stationary EMG into its constituent motor neuron activity without windowing or post hoc correction, including, for the first time in the literature, the results of direct source separation in surface EMG during a dynamic contraction. HarmonICA has the expressive power to learn the complex non-stationarities caused by common system changes in neurophysiological recordings, whilst retaining the reliability of linear ICA. Our future work will focus on expanding EMG decomposition to more freeform and naturalistic movements, which, if successful, would form the basis of future control systems for motor neuron interfaces. We are also interested in seeing how the proposed methodology translates to other neurophysiological domains, such as intracortical recordings.

7 Code and Data Availability

All code is available at https://anonymous.4open.science/r/HarmonICA-FB56/. Data available on reasonable request.

8 Acknowledgements

AKC was supported by the EPSRC Centre for Neurotechnology (EP/L016737/1) and Meta through the Imperial-META Wearable Neural Interfaces Research Centre. AG was supported by the EPSRC AI for Healthcare (EP/S023283/1). IMG was supported by the EPSRC Centre for Neurotechnology (EP/L016737/1) and an EPSRC Doctoral Prize Fellowship. PM was supported by an Eric and Wendy Schmidt AI in Science Postdoctoral Fellowship. SM was supported by HybridNeuro (HORIZON-WIDERA-2021-ACCESS-03 - 101079392) and Meta through the Imperial-META Wearable Neural Interfaces Research Centre. DF was supported by NaturalBionicS (ERC Synergy 810346) and NISNEM (EPSRC EP/T020970/1).

References

[1] S. Muceli, W. Poppendieck, A. Holobar, S. Gandevia, D. Liebetanz, and D. Farina, “Blind identification of the spinal cord output in humans with high-density electrode arrays implanted in muscles,” Science advances, vol. 8, no. 46, p. 5040, 2022.
[2] CTRL Labs at Reality Labs, D. Sussillo, P. Kaifosh, and T. Reardon, “A generic noninvasive neuromotor interface for human-computer interaction,” bioRxiv, pp. 2024–02, 2024.
[3] T. Kapelner, M. Sartori, F. Negro, and D. Farina, “Neuro-musculoskeletal mapping for man-machine interfacing,” Scientific Reports, vol. 10, no. 1, p. 5834, 2020.
[4] J. Eden, M. Bräcklein, J. Ibáñez, D. Y. Barsakcioglu, G. Di Pino, D. Farina, E. Burdet, and C. Mehring, “Principles of human movement augmentation and the challenges in making it a reality,” Nature Communications, vol. 13, no. 1, p. 1345, 2022.
[5] G. Dominijanni, S. Shokur, G. Salvietti, S. Buehler, E. Palmerini, S. Rossi, F. De Vignemont, A. d’Avella, T. R. Makin, D. Prattichizzo, et al., “The neural resource allocation problem when enhancing human bodies with extra robotic limbs,” Nature Machine Intelligence, vol. 3, no. 10, pp. 850–860, 2021.
[6] M. Pachitariu, S. Sridhar, J. Pennington, and C. Stringer, “Spike sorting with kilosort4,” Nature Methods, pp. 1–8, 2024.
[7] S. Garcia, C. Windolf, J. Boussard, B. Dichter, A. P. Buccino, and P. Yger, “A modular implementation to handle and benchmark drift correction for high-density extracellular recordings,” Eneuro, vol. 11, no. 2, 2024.
[8] C. Chen, S. Ma, X. Sheng, D. Farina, and X. Zhu, “Adaptive real-time identification of motor unit discharges from non-stationary high-density surface electromyographic signals,” IEEE Transactions on Biomedical Engineering, vol. 67, no. 12, pp. 3501–3509, 2020.
[9] M. Kramberger and A. Holobar, “On the prediction of motor unit filter changes in blind source separation of high-density surface electromyograms during dynamic muscle contractions,” IEEE Access, vol. 9, pp. 103533–103540, 2021.
[10] C. Windolf, H. Yu, A. C. Paulk, D. Meszéna, W. Muñoz, J. Boussard, R. Hardstone, I. Caprara, M. Jamali, Y. Kfir, et al., “Dredge: robust motion correction for high-density extracellular recordings across species,” BioRxiv, 2023.
[11] M. Pachitariu, S. Sridhar, and C. Stringer, “Solving the spike sorting problem with kilosort,” BioRxiv, 2023.
[12] A. Hyvärinen and E. Oja, “Independent component analysis: algorithms and applications,” Neural networks, vol. 13, no. 4-5, pp. 411–430, 2000.
[13] I. Khemakhem, D. Kingma, R. Monti, and A. Hyvarinen, “Variational autoencoders and nonlinear ica: A unifying framework,” in International Conference on Artificial Intelligence and Statistics, pp. 2207–2217, PMLR, 2020.
[14] B. Mildenhall, P. P. Srinivasan, M. Tancik, J. T. Barron, R. Ramamoorthi, and R. Ng, “Nerf: Representing scenes as neural radiance fields for view synthesis,” Communications of the ACM, vol. 65, no. 1, pp. 99–106, 2021.
[15] A. Holobar and D. Zazula, “Multichannel blind source separation using convolution kernel compensation,” IEEE Transactions on Signal Processing, vol. 55, no. 9, pp. 4487–4496, 2007.
[16] S. Muceli, W. Poppendieck, F. Negro, K. Yoshida, K. P. Hoffmann, J. E. Butler, S. C. Gandevia, and D. Farina, “Accurate and representative decoding of the neural drive to muscles in humans with multi-channel intramuscular thin-film electrodes,” The Journal of physiology, vol. 593, no. 17, pp. 3789–3804, 2015.
[17] K. C. McGill, Z. C. Lateva, and H. R. Marateb, “Emglab: an interactive emg decomposition program,” Journal of neuroscience methods, vol. 149, no. 2, pp. 121–133, 2005.
[18] S. Ma, I. Mendez Guerra, A. H. Caillet, J. Zhao, A. K. Clarke, K. Maksymenko, S. Deslauriers-Gauthier, X. Sheng, X. Zhu, and D. Farina, “NeuroMotion: Open-source simulator with neuromechanical and deep network models to generate surface EMG signals during voluntary movement,” BioRxiv, 2023.
[19] K. Maksymenko, A. K. Clarke, I. Mendez Guerra, S. Deslauriers-Gauthier, and D. Farina, “A myoelectric digital twin for fast and realistic modelling in deep learning,” Nat. Commun., vol. 14, no. 1, p. 1600, 2023.
[20] D. C. McFarland, B. I. Binder-Markey, J. A. Nichols, S. J. Wohlman, M. De Bruin, and W. M. Murray, “A musculoskeletal model of the hand and wrist capable of simulating functional tasks,” IEEE Trans. Biomed. Eng., vol. 70, no. 5, pp. 1424–1435, 2023.
[21] I. Mendez Guerra, D. Y. Barsakcioglu, and D. Farina, “Wearable neural interfaces: Real-time identification of motor neuron discharges in dynamic motor tasks,” BioRxiv, 2024.
[22] F. Negro, S. Muceli, A. M. Castronovo, A. Holobar, and D. Farina, “Multi-channel intramuscular and surface emg decomposition by convolutive blind source separation,” Journal of neural engineering, vol. 13, no. 2, p. 026027, 2016.