Advanced Neurology ML for EEG signal recognition
Figure 5. Comparison of ROC curves for CNN and DNN models on testing and validation datasets. The CNN model achieved an AUC of 0.86 on the test set and 0.87 on the validation set, while the DNN model achieved an AUC of 0.84 on the test set and 0.82 on the validation set. The ROC curves illustrate the trade-off between the true positive rate (sensitivity) and the false positive rate, with the diagonal line representing random performance.
Abbreviations: AUC: Area under the curve; CNN: Convolutional neural network; DNN: Dense neural network; ROC: Receiver operating characteristic.

accuracy. If the differences between validation and testing AUCs were large, it would indicate overfitting. In such cases, the model might capture noise or dataset-specific patterns rather than generalizable features, leading to unreliable predictions in real-world clinical settings. This could result in missed seizures or false alarms, undermining the model's utility in practice.

4. Discussion
This study evaluated the performance of three machine learning models – Light Gradient Boosting Machine (LightGBM), DNNs, and CNNs – in classifying epileptic and non-epileptic EEG signals. Each model demonstrated distinct strengths and limitations, with CNNs emerging as the most robust for this task.

Among the PyCaret models, LightGBM achieved the highest overall accuracy (85.9%) and AUC (0.91). Its confusion matrices for both the validation and test datasets demonstrated a low rate of false positives and negatives, indicating strong precision and recall.22 However, the learning curve revealed a persistent gap between training and validation performance, indicating a lack of convergence.23 This suggests that while LightGBM fits the training data well, it may not generalize effectively to unseen data. Gradient boosting methods like LightGBM are known for their efficiency in structured data24 but can overfit when applied to high-dimensional datasets, such as EEG signals, without sufficient regularization.25

DNNs showed moderate performance, with lower classification accuracy and AUC compared to CNNs and LightGBM. The learning curves exhibited overfitting, as the model achieved near-perfect training accuracy but struggled to generalize to validation and test datasets.26

DNNs lack the inherent ability to capture spatial or temporal dependencies in the data, which are critical for EEG signal analysis. Without convolutional layers, DNNs are less equipped to handle the intricate patterns required for this task, limiting their effectiveness in epilepsy screening.

While LightGBM exhibited strong initial validation performance, its lack of convergence on unseen data, evidenced by the disparity between training and test accuracies, highlights inherent challenges with overfitting in gradient-boosting frameworks applied to high-dimensional EEG data. Potential remedies such as enhanced regularization (e.g., L1/L2 constraints, reduced tree depth) or feature dimensionality reduction could mitigate these issues. Similarly, the underperformance of DNNs relative to CNNs suggests that architectural refinements (e.g., integrating convolutional layers or attention mechanisms) might improve their ability to model spatiotemporal EEG dynamics. However, a central tenet of ML is identifying and optimizing the model class best suited to the problem's intrinsic structure. The CNN's superior performance, stemming from its innate capacity to hierarchically extract localized spectral and temporal features without manual engineering, validates its prioritization for epilepsy-related EEG analysis. Accordingly, while theoretical avenues exist to improve LightGBM and DNNs, CNN's clinically aligned accuracy, generalizability, and convergence behavior justify its focus as the foundation for translational tool development.

CNNs demonstrated the best overall performance, achieving higher accuracy, AUC, and F1 scores compared to DNNs and LightGBM. The confusion matrices for both validation and testing datasets confirmed a balanced classification of epileptic and non-epileptic cases, with lower rates of misclassification. CNNs' ability to process spatial and temporal features makes them particularly well-suited for non-stationary signals like EEG data.27 Their convolutional layers extract hierarchical features, capturing both local patterns and broader dependencies, which are crucial for detecting epilepsy. In addition, the CNN learning curves showed a better convergence trend than LightGBM, reflecting its ability to generalize more effectively.

A critical consideration for the clinical deployment of CNN-based EEG analysis is model interpretability.
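The overfitting diagnostic running through this discussion (near-perfect training performance against weaker validation and test performance) can be sketched in a few lines. This is a minimal illustration, not the study's code: the `roc_auc` and `overfitting_gap` helpers and the 0.05 gap threshold are assumptions introduced here for demonstration.

```python
def roc_auc(labels, scores):
    """Rank-based ROC AUC: the probability that a randomly chosen positive
    example receives a higher score than a randomly chosen negative one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))

def overfitting_gap(train_auc, val_auc, threshold=0.05):
    """Flag overfitting when training AUC exceeds validation AUC by more
    than the (illustrative) threshold, mirroring a learning-curve gap."""
    return (train_auc - val_auc) > threshold

# Perfectly separated scores give an AUC of 1.0
y = [0, 0, 1, 1]
s = [0.1, 0.3, 0.7, 0.9]
print(roc_auc(y, s))                 # 1.0
print(overfitting_gap(0.99, 0.86))   # True: large train/validation gap
print(overfitting_gap(0.87, 0.86))   # False: curves have converged
```

In practice a library routine such as scikit-learn's `roc_auc_score` would replace the hand-rolled AUC, but the gap comparison itself is the same.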
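The remedies suggested above for LightGBM's overfitting (L1/L2 constraints, reduced tree depth) map onto concrete LightGBM hyperparameters. The sketch below shows one plausible configuration: the parameter names are real LightGBM options, but the values are illustrative assumptions, not the study's tuned settings.

```python
# Illustrative LightGBM settings targeting the regularization remedies named
# in the Discussion. Values are assumptions for demonstration only.
regularized_params = {
    "objective": "binary",    # epileptic vs. non-epileptic classification
    "lambda_l1": 0.1,         # L1 constraint on leaf weights
    "lambda_l2": 1.0,         # L2 constraint on leaf weights
    "max_depth": 6,           # reduced tree depth limits model complexity
    "num_leaves": 31,         # keep below 2**max_depth to constrain capacity
    "feature_fraction": 0.8,  # subsample features per tree; helps with
                              # high-dimensional EEG inputs
    "min_data_in_leaf": 50,   # larger leaves resist fitting noise
}

# Training would then look like (requires the lightgbm package):
# model = lightgbm.train(regularized_params, lightgbm.Dataset(X_train, y_train))
```

Whether these constraints close the train/validation gap on EEG data is an empirical question; the point is that each remedy in the text has a direct knob in the framework.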
Volume 4 Issue 2 (2025) 119 doi: 10.36922/an.7941

