MindArm: Mechanized Intelligent Non-Invasive Neuro-Driven Prosthetic Arm System

Maha Nawaz1, Abdul Basit2, and Muhammad Shafique3

1Maha Nawaz is enrolled in Dubai College and affiliated with eBRAIN Lab, United Arab Emirates, [email protected]
2Abdul Basit is with the Division of Engineering, New York University (NYU) Abu Dhabi, United Arab Emirates, [email protected]
3Muhammad Shafique is the Director of eBRAIN Lab, Division of Engineering, New York University (NYU) Abu Dhabi, United Arab Emirates, [email protected]
This work was supported by Rachmad Vidya W. Putra, New York University (NYU) Abu Dhabi, United Arab Emirates, [email protected]

Abstract

Currently, people with a disability or difficulty moving their arms (referred to as “patients”) have very limited technological solutions to efficiently address their physiological limitations. This is mainly due to two reasons: (1) non-invasive solutions like mind-controlled prosthetic devices are typically very costly and require expensive maintenance; and (2) other solutions require invasive brain surgery, which is high-risk, expensive, and difficult to maintain. Therefore, current technological solutions are not accessible to all patients across different financial backgrounds. To address this, we propose a low-cost technological solution called MindArm, a mechanized intelligent non-invasive neuro-driven prosthetic arm system. MindArm employs a deep neural network (DNN) engine to translate brain signals into the intended prosthetic arm motion, thereby helping patients to perform many activities despite their physiological limitations. MindArm utilizes widely accessible and low-cost surface electroencephalogram (EEG) electrodes coupled with an Open Brain Computer Interface and UDP networking to acquire brain signals and transmit them to the compute module for signal processing. In the compute module, we run a trained DNN model to interpret the normalized micro-voltage of the brain signals and then translate it into a prosthetic arm action via serial communication. The experimental results on a fully working prototype demonstrate that, across the three defined actions, MindArm achieves positive success rates, i.e., 91% for idle/stationary, 85% for shake hand, and 84% for pick-up cup. This demonstrates that MindArm provides a novel approach toward an alternative low-cost mind-controlled prosthetic device for all patients.

Index Terms:
EEG, Myoelectric, RPNI, BCI, Artefacts, Artificial Intelligence, Fast Fourier Transform.

I Introduction

Approximately 5.4 million people live with paralysis in the United States alone [1], and 57.7 million people live with limb amputations globally [2]; see the global distribution of age-standardized amputation rates in Fig. 1. Among them are people with a disability or difficulty moving their arms, whom we refer to as “patients” in this paper for brevity. These patients have very limited technological solutions for efficiently addressing their physiological limitations, because existing solutions are typically very costly and therefore not easily accessible to patients across different financial backgrounds. For instance, non-invasive solutions such as mind-controlled prosthetic devices are typically very costly (i.e., over $100K) and expensive to maintain [3][4]. Other solutions require invasive brain surgery, which is very expensive, high-risk, and difficult to maintain, thereby hindering widespread adoption [5][6]. Hence, there is a significant need for an alternative low-cost solution that helps patients perform diverse activities.

Targeted Research Problem: How can we develop a low-cost solution that can help the patients to move their arms for performing desired actions? An efficient solution to this problem will help the patients from different financial backgrounds to access an alternate low-cost solution for addressing their physiological limitations (i.e., unable to move hands) and performing diverse activities, thereby improving their quality of life.

Figure 1: Global Incidence of Traumatic Amputations. Based on the work by Yuan et al. [7] published in Front. Public Health, this map illustrates the age-standardized amputation rates across 204 countries and territories. Areas with higher age-standardized incidence rates (ASIRs) could benefit significantly from the MindArm prosthetic solution, which offers a cost-effective and accessible alternative to assist individuals who have experienced limb loss, thus addressing the global need for affordable prosthetic care.

I-A State-of-the-Art Solutions and Their Limitations

To address the physiological limitations of the patients, the existing solutions can be loosely classified into two categories: (1) non-invasive solutions, such as mind-controlled prosthetic devices; and (2) invasive solutions, such as implanted brain-computer interfaces (BCIs).

Non-invasive Solutions: These solutions do not require invasive devices inside the patient’s body to obtain high-quality information signals. For instance, prosthetic-based solutions like the “DEKA bionic arm” [8] use electromyography (EMG) to capture muscle signals and then translate them into the desired action. Other non-invasive solutions (e.g., gesture control armbands) focus on capturing and interpreting muscle signals without a physical actuator attached [9], thereby making them lightweight. In such EMG-based solutions, when patients contract their muscles, the electrodes detect the muscle signals and send them to the compute module for further processing [10] [11]. However, EMG signals may not be present in amputees or in cases where physical conditions are severely compromised (e.g., patients with severe disabilities such as paraplegia and tetraplegia/quadriplegia). Furthermore, other non-invasive solutions like mind-controlled prosthetic devices are typically very costly (i.e., over $100K) and expensive to maintain [3][4]. This is partially due to the use of effective yet expensive materials such as titanium alloys [12].

Invasive Solutions: To improve the quality of information signals, some solutions employ invasive devices to access the signal sources. Consequently, these devices need to be physically implanted inside the patient’s body through invasive surgery. However, such a procedure is high-risk and typically very expensive. In addition, most of these technologies are not wireless, which makes them difficult to maintain. For instance, to acquire electroencephalogram (EEG) waves, one needs to place electrodes in the patient’s head (skull) and attach the plug and connecting cables to the compute module. Furthermore, such invasive solutions are not commercially available, and it is estimated that when they become available, they may cost hundreds of thousands of dollars [13]. Hence, not all patients can afford them.

In summary, the existing technological solutions are still very expensive as well as difficult to maintain. Moreover, in the invasive solution cases, there is a high risk to the patients’ body, which may cause other negative side effects.

Given the benefits and weaknesses of the state-of-the-art, we identify that the potential solution is to consider a non-invasive solution with low-cost technology, while ensuring high accuracy of signal processing that can correctly interpret the input signals into actions. To fulfill such requirements, we opt to develop a non-invasive EEG-based prosthetic arm.

I-B Scientific Challenges

Our non-invasive EEG-based prosthetic arm solution has the potential to address the existing weaknesses in the state-of-the-art, but it also poses several challenges, as discussed below.

  • To reduce the design cost, one of the main challenges is to design a system with an effective signal processing pipeline that can be implemented using low-cost off-the-shelf devices and modules.

  • Environmental noise pollutes the EEG signals. Such artefacts need to be removed, thereby requiring an effective denoising process.

  • It requires an effective algorithm to learn and extract information from EEG signals. Once the system learns the EEG features, it should be able to correlate these features to the corresponding prosthetic arm action.

Figure 2: Our MindArm Workflow: EEG electrodes capture the brain’s micro-voltage signals, which are then collected by the brain-computer interface (BCI) system with GUI. The data undergoes decomposition, cleaning, and temporary cloud storage before classification by the neural network. After training, the system predicts intended actions in real-time, and these actions are communicated to the prosthetic limb, enabling it to move accordingly.

I-C Our Novel Contributions

To address the targeted problem and scientific challenges, we propose MindArm, a mechanized intelligent non-invasive neuro-driven prosthetic arm system; see an overview in Fig. 2. It employs a deep neural network (DNN) engine to extract information from EEG signals for identifying the given instruction, then translate it into a prosthetic arm action. To achieve this, our MindArm system makes the following novel contributions.

  • Removing Noise in the EEG Signals (Section II-A): We reduce noise in the EEG signals by extracting the band power of each channel in the respective frequency bands so that residual noise is filtered out, and by employing metal insulation to reduce eddy currents and domestic alternating current (AC) noise.

  • Learning EEG Signal Features using DNN Training (Section II-B): We employ DNN training to effectively learn EEG signal features that are obtained from a low-cost off-the-shelf Ganglion board. We employ a window buffer size to overcome the shortcomings of a small number of EEG channels on the Ganglion board.

  • Low-Cost Prosthetic Arm Design (Section II-C): We design the prosthetic arm structure in Fusion360 and build it using a 3D printer and a Prusa MK3 with a combination of ABS (Acrylonitrile Butadiene Styrene), PETG (Polyethylene Terephthalate Glycol), and PLA (Polylactic Acid) filaments to provide a lightweight yet strong structure. Lastly, the design of the prosthetic is modular, allowing all parts to be easily replaced if required, thereby reducing the maintenance cost.

Key Results: To evaluate our MindArm, we build a complete setup encompassing an EEG acquisition module, a compute module (i.e., DNN engine), and an actuator module (e.g., servo motor and prosthetic arm), whose total cost is approximately $450.

Our fully functional prototype of the MindArm system demonstrates promising success rates in performing three predefined actions, showcasing the efficacy of MindArm as an affordable solution for a mind-controlled prosthetic arm.

Figure 3: MindArm Methodology: (a) Data Collection: EEG voltages are transmitted via Bluetooth to the GUI, which processes the data through a Fast Fourier Transform (FFT). The decomposed data are then saved to the cloud for further processing. (b) Neural Network Training: The dataset undergoes cleaning and is categorized into three groups. To accommodate UDP communication latency, the sampling rate is adjusted to 40 Hz. A window size of 80 samples, including a feature set of 20, is selected for training the network, which is subsequently saved to the cloud. (c) Deployment: The trained network interfaces with the prosthetic, designed with servo motors to enable 3 Degrees of Freedom (3DoF) movement. Network outputs are converted into servo actuations to perform the intended tasks.

II Methodology

In this section we will describe the MindArm system in detail, along with the dataset generation, refinement and training workflow. We also elaborate the prosthetic design and label feedback selection along with system integration; see an overview in Fig. 3.

II-A EEG Data Collection & Extraction

The off-the-shelf state-of-the-art devices on the market for collecting EEG signals include the OpenBCI complete Ultracortex [14], Emotiv EPOC X, and Flex kit. However, in addition to the higher price of the Emotiv kits [15], they also require preparation of either saline-soaked felt or gel coating, which further decreases their practical effectiveness compared to the dry electrodes used with the Ganglion brain-computer interface [14]. Therefore, to keep the price of the prosthetic low and retain the practicality of dry electrodes, the Ganglion board is utilized.

The OpenBCI Ganglion board features 4 EEG channels and 2 references, facilitating the use of both dry comb electrodes and flat electrodes with a snap connection interface coated in silver-silver chloride [16]. Flat electrodes are positioned at the Fp1 and Fp2 locations near the nasion, and comb electrodes at T3 and T4, with references placed at A1 and A2. The placement at Fp1 and Fp2 aims to enhance alpha wave detection. Although alpha waves are more prominent in the occipital lobe, placement near the inion (back of the head) often encounters greater noise. Consequently, while the O1 and O2 locations yield stronger alpha and theta wave signals, achieving stable amplitudes across different frequencies using Fast Fourier Transform (FFT) analysis proves more challenging there. A scientific challenge therefore arises when attempting to interpret data from only 4 EEG channels, compared to the numerous EEG channels in non-invasive state-of-the-art devices.

Figure 4: The electrodes are attached to an adjustable band that has different slots for peripheral EEG readings.
TABLE I: Comparison of EEG Devices
Device Name         | Price | Channels/Sensors | Requirements
Ganglion BCI        | $400  | 4 channels       | None
Emotiv EPOC X       | $800  | 14 sensors       | Saline/Gel
Emotiv Flex Kit     | $1700 | 32 sensors       | Saline/Gel
OpenBCI Ultracortex | $2400 | 16 channels      | None
Figure 5: EEG Signal Processing Overview. (a) Population-based neural network thresholds indicate average band power values, enhancing general algorithm accuracy at the expense of individual specificity. (b) Metric values, derived from these thresholds, facilitate relaxation state detection through FFT-based amplitude criteria across gamma, beta, alpha, theta, and delta brainwave bands.

In the initial phase of the study, rather than directly decomposing the EEG signal into specific frequency bands, a more rudimentary threshold-based system was employed to ascertain the subject’s state of relaxation or concentration. The determination of the subject being relaxed or focused was then mapped to corresponding actions to be executed by the prosthetic device.

The results indicate promising trends; however, the current method presents a notable limitation: the states of concentration and relaxation are not distinctly categorized by the Graphical User Interface (GUI). Consequently, the algorithm fails to differentiate between ‘relaxed’ and ‘concentrated’ states, only recognizing ‘relaxed and stationary’ as well as ‘concentrated and stationary’ states. This issue stems from an observable overlap in the metric thresholds for relaxation and concentration within the GUI. Therefore, it is imperative to develop an alternative approach that enables accurate prediction across all three desired states: handshaking, cup grasping, and remaining stationary. Notably, a parallel in amplitude across various frequencies is observed when comparing the thought of handshaking with the relaxed state, and the thought of cup grasping with the concentrated state.

Figure 6: Correlation between metric threshold and prediction success rates. (a) illustrates a positive relationship where an increased metric threshold is associated with a higher number of successful handshaking predictions out of 20 attempts. (b) shows a generally positive trend in the number of successful predictions for cup grasping as the metric threshold rises; however, a notable decrease in successful predictions is observed at a threshold of 0.9, underperforming the success rate at the threshold of 0.8.

To enhance the accuracy of state prediction, the proposed methodology involves adopting a strategy of associating relaxation with the intent to shake hands and concentration with the intent to pick up a cup.

To validate the efficacy of this approach, a novel signal processing technique is employed, which decomposes EEG signals into five distinct brain wave categories: delta, theta, alpha, beta, and gamma, each within their characteristic frequency bands: delta (1–4 Hz), theta (4–8 Hz), alpha (8–13 Hz, with sub-bands alpha-1 8–10 Hz and alpha-2 11–13 Hz), beta (13–30 Hz), and gamma (above 30 Hz). This spectral decomposition allows for more precise extraction of features from each channel, thereby reducing noise transmission through the User Datagram Protocol (UDP) network and enhancing algorithmic accuracy. The choice of UDP is driven by its expedited data transfer capabilities, absence of connection establishment procedures, and greater efficiency via lower bandwidth usage and overhead, thus compensating for potential latency and ensuring a robust set of features for the neural network classifier, contributing to a low-maintenance system [17].
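As an illustration of the spectral decomposition described above, the following is a minimal sketch, not the OpenBCI GUI's implementation, of how per-band power could be computed from one EEG channel with an FFT. The band edges follow the text; the function name `band_powers`, the mean-power definition, and the synthetic 10 Hz test signal are assumptions made for this example.

```python
import numpy as np

# Band edges in Hz as listed in the text; gamma is taken as everything above 30 Hz.
BANDS = {
    "delta": (1.0, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0),
    "gamma": (30.0, float("inf")),
}


def band_powers(signal, fs):
    """Return the mean spectral power of `signal` in each band (hypothetical helper)."""
    signal = np.asarray(signal, dtype=float)
    signal = signal - signal.mean()                    # remove the DC offset
    spectrum = np.fft.rfft(signal)                     # one-sided FFT of a real signal
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)   # frequency of each FFT bin
    power = np.abs(spectrum) ** 2 / signal.size
    result = {}
    for name, (low, high) in BANDS.items():
        mask = (freqs >= low) & (freqs < high)
        result[name] = float(power[mask].mean()) if mask.any() else 0.0
    return result


# Synthetic check: a pure 10 Hz sine sampled at 200 Hz should be alpha-dominant.
fs = 200.0
t = np.arange(0.0, 2.0, 1.0 / fs)
alpha_like = np.sin(2.0 * np.pi * 10.0 * t)
powers = band_powers(alpha_like, fs)
dominant = max(powers, key=powers.get)
```

Applying such a function to each of the four channels would yield the 4 x 5 band-power array that is later flattened into 20 features.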

Figure 7: Characteristic Brain Wave Patterns. Gamma waves exhibit peaks during problem-solving and deep concentration. Beta waves are pronounced with active mental engagement. Alpha waves are elevated during periods of relaxation and reflection. Theta waves increase with drowsiness, while delta waves are predominant in deep sleep.
Figure 8: Similarity in Amplitude Spikes Across Frequencies. The Fast Fourier Transform (FFT) graphs display coinciding amplitude peaks at corresponding frequencies (a), indicating analogous band power characteristics among the channels.
Figure 9: The graphs illustrate the frequency amplitudes across the spectrum. Notably, the amplitude pattern associated with a concentrated state closely matches that observed during the action of picking up a cup, suggesting that this action is predominantly executed in a state of concentration.

Moreover, the system capitalizes on the Ganglion board’s capability to process data at a sampling rate of 200 Hz by transmitting only the most significant bit from each EEG sample [18]. This transmission approach results in slightly lower sample resolution, which is considered a negligible trade-off. Consequently, the neural network receives a substantial, numerically labeled dataset every second for each classification category. This enhances the learning rate of the prosthetic actions, offering a cost-effective and faster training methodology when compared to other BCI devices within the same $400 price bracket as the Ganglion.

Figure 10: EEG Signal Cleaning Process: Illustrates the application of novel techniques for the removal of major artifacts, resulting in the accurate representation of amplitudes at specified frequencies.

A significant impediment encountered with non-invasive mind-controlled prosthetics is environmental electrical noise interference at the user’s location. To mitigate this, data was sampled across various settings with differing levels of domestic alternating current (AC). Areas with minimal AC interference exhibited reduced noise in the EEG channels. Despite this, the reduction was not adequate to yield clean channel features suitable for neural network input. Moreover, this approach does not align with the practical need for portability, as the prosthetic should function optimally in diverse locations irrespective of AC noise levels. The results of this filtering process are shown in Fig. 10.

With artifacts minimized, it is essential to transmit the band power data to the Python IDE in real-time. This transmission is facilitated through UDP networking, enabling data transfer via a designated port and socket. The maximum buffer size is set to 1024 bytes, which sufficiently accommodates the data payload. For each sample frame per second, the data is converted from binary to decimal format, flattened, and then written to a designated file, where all the data within that file is categorized under the same class.

Figure 11: Comparison of TCP and UDP Protocols. TCP initiates communication with a three-way handshake, starting with a synchronize (SYN) message from the client (a), followed by a synchronize-acknowledgment (SYN-ACK) from the server, and concluding with an acknowledgment (ACK) from the client (b), thereby establishing a connection. In contrast, UDP operates without a handshake, allowing data to be sent and received without establishing a persistent connection.
UDP Networking and Data Transmission

    # Python code for UDP networking
    import socket

    UDP_IP = "127.0.0.1"
    UDP_PORT = 4005

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((UDP_IP, UDP_PORT))

    # Data is sent as a JSON message of the form:
    # [
    #   [delta_ch_1, theta_ch_1, alpha_ch_1, beta_ch_1, gamma_ch_1],
    #   [delta_ch_2, theta_ch_2, alpha_ch_2, beta_ch_2, gamma_ch_2],
    #   [delta_ch_3, theta_ch_3, alpha_ch_3, beta_ch_3, gamma_ch_3],
    #   [delta_ch_4, theta_ch_4, alpha_ch_4, beta_ch_4, gamma_ch_4],
    # ]
    #
    # Example received message with the IDE:
    # received message: b'{"type":"bandPower","data":[Array[5x4]]}'

The flattening process is represented as: flattened_data = [i] + [item for sublist in data_list for item in sublist]

Here, ‘i’ represents the input number, facilitating continuous tracking of the dataset size.
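To make the receive-and-flatten step concrete, the sketch below parses one band-power datagram and applies the flattening expression given above. It assumes the message is UTF-8 JSON with a `type` field and a 4 x 5 `data` array, as in the example message; the helper names `flatten_band_power` and `receive_rows` and the synthetic payload are hypothetical and not taken from the authors' code.

```python
import json
import socket

UDP_IP = "127.0.0.1"
UDP_PORT = 4005
BUFFER_SIZE = 1024  # maximum datagram size read per call, as stated in the text


def flatten_band_power(message, i):
    """Decode one bandPower datagram and return [i, ch1 bands..., ch4 bands...]."""
    payload = json.loads(message.decode("utf-8"))
    if payload.get("type") != "bandPower":
        raise ValueError("unexpected message type: %r" % payload.get("type"))
    data_list = payload["data"]
    return [i] + [item for sublist in data_list for item in sublist]


def receive_rows(max_rows):
    """Yield up to max_rows flattened rows; blocks until datagrams arrive."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((UDP_IP, UDP_PORT))
    try:
        for i in range(max_rows):
            message, _addr = sock.recvfrom(BUFFER_SIZE)
            yield flatten_band_power(message, i)
    finally:
        sock.close()


# Offline check with a synthetic payload (4 channels x 5 bands); no socket is opened here.
fake_data = [[channel * 10 + band for band in range(5)] for channel in range(4)]
example = json.dumps({"type": "bandPower", "data": fake_data}).encode("utf-8")
row = flatten_band_power(example, 7)
```

Each resulting row holds the running index followed by 20 band-power values, which matches the 20-column structure used for training.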

II-B Neural Network Training

Currently, the algorithm processes three datasets, each corresponding to a distinct action executed by the prosthetic:

  • Shaking hands with an individual.

  • Remaining stationary or in a resting position.

  • Picking up a cup.

Figure 12: Cleaned data from the files are fed into the neural network, which then undergoes its training process. During this process, the labels of the inputs are compared with the neural network’s predicted results, facilitating gradient descent optimization via the loss function. This optimization adjusts the network’s parameters to improve performance.

During the data collection phase, the user is instructed to concentrate on one of three predetermined actions for a fixed duration. Concurrently, the acquired data undergoes cleaning procedures before being stored in a CSV file, with each entry tagged with a numerical label corresponding to the envisioned action. For instance, data associated with the act of shaking hands is recorded in ‘shakehands.csv’ and labeled as ‘0’. Although the intended sampling rate is 200 Hz, practical limitations due to latency in data processing result in an effective rate of approximately 50 Hz. This discrepancy necessitates an extended duration for the training phase. The resultant datasets comprise between 10,000 and 20,000 samples each, which are employed for training the neural network. The data structure includes 20 columns, reflecting the aggregation of five normalized brainwave metrics across four distinct EEG channels.

Algorithm 1 Data preprocessing and dynamic segmentation into windows with label assignment
Input: Datasets df_pickUpCup, df_shakeHands, df_stayIdle with corresponding labels 0, 1, 2
Output: Windowed features X_windowed, one-hot encoded labels y_windowed
procedure SegmentDataset(X, y, win_size, num_chan)
    max_overlap ← maxwin_size/2
    min_overlap ← min(20, win_size/4)
    Initialize segmented_X ← [], segmented_y ← []
    start_idx ← 0
    while start_idx + win_size ≤ X.shape[0] do
        overlap ← random integer between min_overlap and max_overlap
        step_size ← win_size - overlap
        end_idx ← start_idx + win_size
        segment ← X[start_idx : end_idx, :]
        segment ← segment.reshape(win_size, num_chan)
        Append segment to segmented_X
        Append y[start_idx] to segmented_y
        start_idx ← start_idx + step_size
    end while
    return segmented_X, segmented_y
end procedure
Combine and preprocess datasets
Assign labels and standardize features
Dynamically segment features into windows with random overlap and assign labels
Combine windowed data from all actions
Split data into training and test sets
Initialize EEGDataset with features and labels
Create DataLoaders for training and test sets
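The SegmentDataset procedure of Algorithm 1 can be sketched in Python as follows. This is an illustrative reading rather than the authors' code: the upper overlap bound is assumed to be half the window, the `rng` parameter is added here for reproducibility, and the one-hot step uses three classes as in the text.

```python
import numpy as np


def segment_dataset(X, y, win_size, num_chan, rng=None):
    """Cut X (rows x features) into windows with a random overlap between windows."""
    if rng is None:
        rng = np.random.default_rng()
    max_overlap = win_size // 2           # assumption: at most half a window overlaps
    min_overlap = min(20, win_size // 4)
    segmented_X, segmented_y = [], []
    start_idx = 0
    while start_idx + win_size <= X.shape[0]:
        # integers() excludes the upper bound, hence the +1
        overlap = int(rng.integers(min_overlap, max_overlap + 1))
        step_size = win_size - overlap
        end_idx = start_idx + win_size
        segment = X[start_idx:end_idx, :].reshape(win_size, num_chan)
        segmented_X.append(segment)
        segmented_y.append(y[start_idx])  # label of the first row in the window
        start_idx += step_size
    if not segmented_X:
        return np.empty((0, win_size, num_chan)), np.empty((0,), dtype=int)
    return np.stack(segmented_X), np.array(segmented_y)


# Synthetic example: 1000 rows of 20 features, all carrying label 2.
data_rng = np.random.default_rng(0)
X = data_rng.normal(size=(1000, 20))
y = np.full(1000, 2)
Xw, yw = segment_dataset(X, y, win_size=100, num_chan=20, rng=np.random.default_rng(1))
one_hot = np.eye(3)[yw]                   # one-hot encoded labels for three actions
```

With a window of 100 rows, the step between consecutive windows varies between 50 and 80 rows, so neighbouring windows share between 20 and 50 rows.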

In the exploration of optimal neural network architectures for our dataset, a diverse range of models was assessed. These included simple Feedforward Neural Networks (FFNNs), Recurrent Neural Networks (RNNs), Long Short-Term Memory networks (LSTMs), CNN-LSTM hybrids, and Transformer-based networks. Among these, the Transformer-based network outperformed the other models, achieving a validation accuracy of 97.1%. This superior performance can be attributed to the Transformer’s ability to process sequential data in parallel and its efficient handling of long-range dependencies, which are critical for understanding complex patterns within the EEG dataset. However, the trade-off for this high level of accuracy is increased computational resources and training time compared to simpler models like FFNNs or RNNs. The decision to employ a Transformer-based model thus reflects a balance between seeking optimal performance and managing the computational costs associated with more sophisticated architectures. The Transformer architecture is illustrated in Fig. 13. The training accuracies of these networks are depicted in Fig. 14, which compares training performance across epochs for the various network architectures.
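To show the data flow that such a model applies to one window (self-attention, layer normalization, average pooling, and a feed-forward classification head), the following is a deliberately simplified NumPy forward pass. It is not the authors' trained EEGTransformer: it uses a single attention head, omits the encoder's internal feed-forward sub-layer, and relies on random weights; the hidden size of 32 and all function names are assumptions for illustration.

```python
import numpy as np


def layer_norm(x, eps=1e-5):
    """Normalize each row to zero mean and unit variance."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def softmax(x, axis=-1):
    shifted = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product attention over the time steps of one window."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])        # every time step attends to every other
    return softmax(scores, axis=-1) @ v


def forward(window, params):
    """Map one (time steps x features) window to class probabilities."""
    attended = self_attention(window, params["w_q"], params["w_k"], params["w_v"])
    h = layer_norm(window + attended)              # residual connection + layer norm
    pooled = h.mean(axis=0)                        # average pooling over time
    hidden = np.maximum(0.0, pooled @ params["w1"] + params["b1"])  # ReLU layer
    logits = hidden @ params["w2"] + params["b2"]
    return softmax(logits)


rng = np.random.default_rng(0)
d, hidden_dim, n_classes = 20, 32, 3               # 20 features, 3 actions; 32 is assumed
params = {
    "w_q": rng.normal(scale=0.1, size=(d, d)),
    "w_k": rng.normal(scale=0.1, size=(d, d)),
    "w_v": rng.normal(scale=0.1, size=(d, d)),
    "w1": rng.normal(scale=0.1, size=(d, hidden_dim)),
    "b1": np.zeros(hidden_dim),
    "w2": rng.normal(scale=0.1, size=(hidden_dim, n_classes)),
    "b2": np.zeros(n_classes),
}
window = rng.normal(size=(100, d))                 # one window: 100 time steps x 20 features
probs = forward(window, params)
```

Because the attention scores for all time steps are computed in one matrix product, the whole window is processed in parallel, which is the property the paragraph above refers to.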

Figure 13: EEGTransformer Architecture. The depicted Transformer-based model processes segmented EEG data sequences, leveraging a multi-head self-attention mechanism and subsequent layer normalizations within the Transformer Encoder. Data flows through an average pooling layer before passing through two feed-forward networks, culminating in the classification of the prosthetic arm’s actions. These predicted actions are then communicated to the prosthetic arm via serial communication for execution.
Figure 14: Comparative Training Performance of Neural Network Architectures. This graph presents the training accuracies over numerous epochs for each network model tested. It highlights the Transformer-based network’s superior accuracy relative to the other networks, underlining its robust capability in learning from the EEG dataset.

Initial training sessions revealed challenges with label volatility, as the model output action labels at a rate of approximately 40 times per second over UDP. This high frequency of label generation led to instances where transient thought patterns inadvertently triggered unintended actions. For example, a brief, unintended contemplation of an action could result in the erroneous activation of the prosthetic arm.

To address this, the model’s training and input collection were adapted to include a larger window size, enhancing data stability and output accuracy. It is critical for the input dataset to accurately reflect sustained thought patterns associated with specific actions, which typically last more than a fortieth of a second. In practice, a thought duration of at least 2 seconds is necessary for consistent brain wave intensity.

Figure 15: Impact of Window Size on Transformer Network Accuracy. This bar graph illustrates the relationship between various window sizes and the corresponding accuracy of the model. Each bar represents a different window size which is dynamically overlapped. The highest accuracy marked by a green bar underscores the optimal window size that maximizes accuracy, highlighting the importance of window size selection in neural network performance.

Accordingly, an optimized window size of 100 was implemented. This setup accumulates 100 rows of CSV data—each row representing a 20-dimensional vector from the EEG—into a single tensor. This tensor is then reshaped into a 1x2000 matrix (20x100), serving as the input for the neural network. This approach ensures that the input data effectively represents approximately two seconds of EEG data, allowing for more accurate and representative model outputs as live EEG data is streamed.
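The accumulation step described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the class name `WindowBuffer` and its `push` method are hypothetical, and the sketch assumes that windows do not overlap, so the buffer is emptied each time a full window is emitted.

```python
from typing import List, Optional

WINDOW_SIZE = 100   # rows per window, as stated in the text
NUM_FEATURES = 20   # values per EEG row, as stated in the text


class WindowBuffer:
    """Hypothetical helper: collects EEG rows and emits one flat window."""

    def __init__(self, window_size: int = WINDOW_SIZE,
                 num_features: int = NUM_FEATURES) -> None:
        self.window_size = window_size
        self.num_features = num_features
        self._rows: List[List[float]] = []

    def push(self, row: List[float]) -> Optional[List[float]]:
        """Add one row; return a flattened window once the buffer is full."""
        if len(row) != self.num_features:
            raise ValueError(
                f"expected {self.num_features} values, got {len(row)}")
        self._rows.append(list(row))
        if len(self._rows) < self.window_size:
            return None
        # Flatten row by row: 100 rows x 20 features -> 2000 values.
        flat = [value for r in self._rows for value in r]
        self._rows.clear()  # assumption: consecutive windows do not overlap
        return flat
```

Each call to `push` returns `None` until the hundredth row arrives, at which point it returns a single list of 2000 values that could be passed to the network as one input.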

Figure 16: EEG Data Sample Preparation. Each EEG data sample, comprising 20 features, is incorporated into a dynamically sized window that accumulates multiple samples for neural network input. The resulting window is fed to the neural network as a single input, with each ‘block’ representing an individual feature used for training.

This approach requires the neural network to receive a single, consolidated input. Consequently, the total number of potential inputs from each file was determined, and the matrices were then reshaped to conform to the neural network’s specified input dimensions.

TABLE II: Dataset Dimensions Before and After Processing
Action             Original Dimension    Windowed Dimension
Shake Hands        [13697, 20]           [136, 100, 20]
Stay Stationary    [19622, 20]           [196, 100, 20]
Pick Up Cup        [20837, 20]           [208, 100, 20]
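The windowed dimensions in Table II are consistent with splitting each recording into non-overlapping windows of 100 rows and discarding the incomplete remainder. The short sketch below reproduces that arithmetic. The function name `windowed_shape` is hypothetical, and the drop-the-remainder rule is an assumption inferred from the table, not something the text states.

```python
from typing import Tuple


def windowed_shape(num_rows: int, num_features: int,
                   window_size: int = 100) -> Tuple[int, int, int]:
    """Shape after cutting a recording into non-overlapping windows.

    Assumption: rows left over after the last full window are dropped.
    """
    num_windows = num_rows // window_size
    return (num_windows, window_size, num_features)


# Row counts taken from Table II.
recordings = [
    ("Shake Hands", 13697),
    ("Stay Stationary", 19622),
    ("Pick Up Cup", 20837),
]

for action, rows in recordings:
    print(action, [rows, 20], "->", list(windowed_shape(rows, 20)))
```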

With the refactored algorithm, the intended user action is transmitted to the Arduino once every 2 seconds during inference. This decreases the volatility of the actions, because a substantially greater number of features is taken into consideration over a sustained time period, thereby enhancing the robustness of the model.

Additionally, the neural network undergoes pre-training prior to its deployment and is subsequently stored on a cloud service. This strategy ensures that the duration for outputting a label is shorter than the interval between consecutive inputs, thereby reducing the potential for data loss during the wireless EEG streaming process.

II-C Prosthetic Design and Label Feedback

The prosthetic control system transmits the output label to the Arduino via serial communication every 2 seconds. At a standardized baud rate, the Arduino interprets the received number ‘0, 1, or 2’ and initiates the corresponding action. To prevent potential damage to the servos and artificial tendons due to rapid oscillation between positions, the algorithm is designed to pause reading incoming data until the current action is fully executed.
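The pause-until-done behaviour can be illustrated with a small simulation. This is a sketch in Python, not the Arduino firmware itself: the class `ActionGate`, the assignment of labels 0, 1 and 2 to particular actions, and the per-action durations are all assumptions made for illustration.

```python
from typing import Dict, Optional

# Assumed mapping; the text states only that labels 0, 1 and 2 are used.
ACTIONS: Dict[int, str] = {
    0: "stayStationary",
    1: "shakeHands",
    2: "pickUpCup",
}


class ActionGate:
    """Ignores incoming labels while the current action is still running."""

    def __init__(self, durations: Dict[int, float]) -> None:
        self.durations = durations  # seconds each action takes (assumed)
        self.busy_until = 0.0       # time at which the arm is free again

    def receive(self, label: int, now: float) -> Optional[str]:
        """Return the action started at time `now`, or None if ignored."""
        if label not in ACTIONS:
            return None             # unknown label: do nothing
        if now < self.busy_until:
            return None             # arm still moving: skip this label
        self.busy_until = now + self.durations.get(label, 0.0)
        return ACTIONS[label]
```

With assumed durations of 3 seconds for a handshake and 5 seconds for picking up a cup, a label that arrives 2 seconds into a pick-up motion is dropped, so the servos are never asked to reverse mid-movement.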

Figure 17: The prosthetic was designed and modeled in Fusion360 to the precise dimensions of the author’s hand, incorporating 5 tendon tubes for threading braided fishing line, as illustrated in the adjacent figure.

Initially, the prosthetic’s actuation mechanism relied on the contraction and relaxation of tendons, facilitated by a servo horn. However, this design was found impractical due to the persistent friction between the tendons, made of braided fishing line, and the joint pins. Consequently, this friction led to the tendons’ degradation over time, resulting in both wear and tear of the prosthetic components and a decline in performance.

Figure 18: The second prototype introduces elastic bungee cords to return the finger to its original position, replacing the previous mechanism of coupled flexion and extension via tendons. Unlike the braided fishing lines, the bungee cords minimize friction during stretch and contraction, thereby enhancing the prosthetic’s durability.

Servos with a 20 kg torque rating are used at the elbow joint so that the load borne by the prosthetic is adequately supported, which improves the system’s durability and practicality. A modular design was developed for each servo compartment, allowing users to easily replace servos without the need for specialized tools.

Figure 19: Modular Elbow Servo Joint. This design feature enhances the prosthetic’s maintainability, allowing for easier repairs and upgrades.

In the current market, commercially available realistic prosthetic gloves, exemplified by products from companies like Ottobock, typically start at a price point exceeding 250 USD. In contrast, this study utilized a silicone mold technique to fabricate a comparable realistic prosthetic glove at a material cost of just 15 USD. This substantial reduction in cost represents a significant stride toward democratizing access to prosthetic technology, markedly lowering the financial barrier for potential users.

Figure 20: Demonstration of the prosthetic functionality. The left panel depicts the user concentrating on the action of picking up a cup, prompting the prosthetic to mimic the gesture in the center frame. The right panel shows the user contemplating a handshake, which the prosthetic executes accordingly.

III Results and Discussion

III-A Experimental Setup

The experimental setup for the prosthetic system comprises an array of specialized tools and technologies. The neural signal acquisition is managed using the Ganglion board and accompanying GUI from OpenBCI. Data communication is facilitated via UDP networking. For the development and training of the neural network, the PyTorch framework is employed. The system’s commands are transmitted through serial communication with the Arduino C++ IDE. The design and prototyping phase utilizes Tinkercad and Fusion360 for 3D modeling, alongside custom settings in Prusa Slicer for 3D printing and circuit schematic refinement.

III-B Analysis

The Transformer network’s performance was analysed by interfacing the model output with an Arduino board using serial communication. The Arduino was programmed to translate the neural network’s output into actionable commands for the prosthetic hand. Each predicted action from the network triggered the corresponding movement in the prosthetic hand, showcasing the potential of this system in real-world applications. The deployment demonstrated not only the accuracy of the Transformer network, as reflected in the classification report and the confusion matrix, but also its capability to operate in real time with the physical hardware, moving directly from prediction to action execution.

Figure 21: The confusion matrix compares actual and predicted actions, giving the number of instances of each action in the test dataset, and highlights the model’s ability to differentiate between the prosthetic’s actions with a high degree of accuracy. The pie chart shows the real-world test cases, where the system also achieves good accuracy.
TABLE III: Classification Report
Class Precision Recall F1-Score
pickUpCup 0.86 0.82 0.84
shakeHands 0.83 0.86 0.85
stayStationary 0.90 0.92 0.91
Accuracy 0.86
Macro Avg 0.86 0.87 0.86
Weighted Avg 0.86 0.86 0.86
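For reference, the per-class figures in a report such as Table III follow the standard definitions: precision is TP / (TP + FP), recall is TP / (TP + FN), and the F1-score is the harmonic mean of the two. The sketch below computes them from a confusion matrix. The function name `per_class_metrics` is hypothetical, and the matrix in the usage example is invented purely to exercise the code; it is not the paper's confusion matrix.

```python
from typing import Dict, List


def per_class_metrics(matrix: List[List[int]],
                      labels: List[str]) -> Dict[str, Dict[str, float]]:
    """Precision, recall and F1 per class.

    Rows of `matrix` are actual classes, columns are predicted classes.
    """
    report: Dict[str, Dict[str, float]] = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        fn = sum(matrix[i]) - tp                 # actual i, predicted other
        fp = sum(row[i] for row in matrix) - tp  # predicted i, actual other
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    return report


# Toy matrix with made-up counts, for illustration only.
toy_labels = ["pickUpCup", "shakeHands", "stayStationary"]
toy_matrix = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 2, 8],
]
toy_report = per_class_metrics(toy_matrix, toy_labels)
```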

IV Conclusion

In this paper, we present MindArm, a low-cost mind-controlled prosthetic arm solution for people of determination, enabling them to move their hands to perform activities. It translates brain signals into the intended arm motion by using EEG technologies and a DNN model that interprets brain signals as prosthetic arm actions. The experimental results show that our MindArm system achieves positive success rates for three different actions, i.e., 90% for idle/stationary, 80% for shake hand, and 80% for pick-up cup. This demonstrates that MindArm provides a novel approach toward an alternative low-cost mind-controlled prosthetic device for all people.

References

  • [1] B. Armour, E. Courtney-Long, M. Fox, H. Fredine, and A. Cahill, “Prevalence and causes of paralysis-united states, 2013,” American journal of public health, vol. 106, pp. e1–e3, 08 2016.
  • [2] C. McDonald, S. Westcott-McCoy, M. Weaver, J. Haagsma, and D. Kartin, “Global prevalence of traumatic non-fatal limb amputation,” Prosthetics and Orthotics International, vol. Publish Ahead of Print, 12 2020.
  • [3] E. Kwek and M. Choi, “Is a prosthetic arm customized prada? a critical perspective on the social aspects of prosthetic arms,” Disability & Society, vol. 31, pp. 1144 – 1147, 2016. [Online]. Available: https://api.semanticscholar.org/CorpusID:152081983
  • [4] T. Beyrouthy, S. A. Kork, J. A. Korbane, and A. Abdulmonem, “Eeg mind controlled smart prosthetic arm,” 2016 IEEE International Conference on Emerging Technologies and Innovative Business Practices for the Transformation of Societies (EmergiTech), pp. 404–409, 2016. [Online]. Available: https://api.semanticscholar.org/CorpusID:34212152
  • [5] L. H. B. Huinink, H. Bouwsema, D. H. Plettenburg, C. K. van der Sluis, and R. M. Bongers, “Learning to use a body-powered prosthesis: changes in functionality and kinematics,” Journal of NeuroEngineering and Rehabilitation, vol. 13, 2016. [Online]. Available: https://api.semanticscholar.org/CorpusID:1815633
  • [6] J. Collinger, S. Foldes, T. Bruns, B. Wodlinger, R. Gaunt, and D. Weber, “Neuroprosthetic technology for individuals with spinal cord injury,” The journal of spinal cord medicine, vol. 36, pp. 258–272, 07 2013.
  • [7] B. Yuan, D. Hu, S. Gu, S. Xiao, and F. Song, “The global burden of traumatic amputation in 204 countries and territories,” Frontiers in Public Health, vol. 11, 2023. [Online]. Available: https://www.frontiersin.org/journals/public-health/articles/10.3389/fpubh.2023.1258853
  • [8] C. Bloomer and K. L. Kontson, “Comparison of deka arm and body-powered upper limb prosthesis joint kinematics,” Archives of Rehabilitation Research and Clinical Translation, vol. 2, no. 3, p. 100057, 2020.
  • [9] S. M. M. Rahman, H. Mattila, M. Janka, and J. Virkki, “Impedance evaluation of textile electrodes for eeg measurements,” Textile Research Journal, vol. 93, no. 7-8, pp. 1878–1888, 2023. [Online]. Available: https://doi.org/10.1177/00405175221135131
  • [10] K. R. K. Kadir A. Yildiz, Alexander Y. Shin, “Interfaces with the peripheral nervous system for the control of a neuroprosthetic limb: a review,” Journal of NeuroEngineering and Rehabilitation, vol. Publish Ahead of Print, March 2020.
  • [11] P. Visconti, F. Gaetani, G. Zappatore, and P. Primiceri, “Technical features and functionalities of myo armband: An overview on related literature and advanced applications of myoelectric armbands mainly focused on arm prostheses,” International Journal on Smart Sensing and Intelligent Systems, vol. 11, pp. 1–25, 06 2018.
  • [12] M. Sarraf, E. Rezvani Ghomi, S. Alipour, S. Ramakrishna, and N. Liana Sukiman, “A state-of-the-art review of the fabrication and characteristics of titanium and its alloys for biomedical applications,” Bio-design and Manufacturing, pp. 1–25, 2021.
  • [13] D. R. Sandra V A, “Brain gate technology,” International Journal of Engineering Research & Technology (IJERT), vol. Publish Ahead of Print, 2015.
  • [14] OPENBCI. (2024) The complete ultracortex. [Online]. Available: https://shop.openbci.com/products/the-complete-headset-eeg
  • [15] Emotiv. (2024) Epoc. [Online]. Available: https://www.emotiv.com/epoc
  • [16] OPENBCI. (2024) Ganglion board (4-channels). [Online]. Available: https://shop.openbci.com/products/ganglion-board
  • [17] A. Roshdy, S. Al Kork, S. Said, and T. Beyrouthy, “A wearable exoskeleton rehabilitation device for paralysis-a comprehensive study,” vol. 4, pp. 17–26, 01 2019.
  • [18] OpenBCI. (2024) Ganglion data format. [Online]. Available: https://docs.openbci.com/Ganglion/GanglionDataFormat/