Figure 5. Comparison of ROC curves for the CNN and DNN models on the testing and validation datasets. The CNN model achieved an AUC of 0.86 on the test set and 0.87 on the validation set, while the DNN model achieved an AUC of 0.84 on the test set and 0.82 on the validation set. The ROC curves illustrate the trade-off between the true positive rate (sensitivity) and the false positive rate, with the diagonal line representing random performance.
Abbreviations: AUC: Area under the curve; CNN: Convolutional neural network; DNN: Dense neural network; ROC: Receiver operating characteristic.

accuracy. If the differences between validation and testing AUCs were large, it would indicate overfitting. In such cases, the model might capture noise or dataset-specific patterns rather than generalizable features, leading to unreliable predictions in real-world clinical settings. This could result in missed seizures or false alarms, undermining the model's utility in practice.
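For illustration, the sketch below shows one way to quantify such a validation-test AUC gap for a fitted binary classifier with scikit-learn. It is not the pipeline used in this study; the function name, the data placeholders, and the 0.05 gap threshold are assumptions.

```python
# Illustrative sketch only (not the study's code). Assumes a fitted binary
# classifier exposing predict_proba, plus held-out validation and test splits.
from sklearn.metrics import roc_auc_score

def auc_gap_check(model, X_val, y_val, X_test, y_test, max_gap=0.05):
    """Compare validation and test AUCs; a large gap suggests overfitting."""
    val_auc = roc_auc_score(y_val, model.predict_proba(X_val)[:, 1])
    test_auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    return {
        "val_auc": val_auc,
        "test_auc": test_auc,
        "possible_overfit": abs(val_auc - test_auc) > max_gap,  # heuristic threshold
    }
```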
4. Discussion

This study evaluated the performance of three machine learning models – Light Gradient Boosting Machine (LightGBM), DNNs, and CNNs – in classifying epileptic and non-epileptic EEG signals. Each model demonstrated distinct strengths and limitations, with CNNs emerging as the most robust for this task.

Among the PyCaret models, LightGBM achieved the highest overall accuracy (85.9%) and AUC (0.91). Its confusion matrices for both the validation and test datasets demonstrated low rates of false positives and false negatives, indicating strong precision and recall.22 However, the learning curve revealed a persistent gap between training and validation performance, indicating a lack of convergence.23 This suggests that while LightGBM fits the training data well, it may not generalize effectively to unseen data.24 Gradient boosting methods such as LightGBM are known for their efficiency on structured data but can overfit when applied to high-dimensional datasets, such as EEG signals, without sufficient regularization.25
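The precision and recall cited here follow directly from the confusion-matrix counts; the brief sketch below illustrates the computation with scikit-learn. The `model`, `X_val`, and `y_val` names are hypothetical placeholders, not the study's objects.

```python
# Illustrative sketch only: confusion-matrix counts and derived metrics for a
# fitted binary classifier. All argument names are hypothetical placeholders.
from sklearn.metrics import confusion_matrix, precision_score, recall_score, f1_score

def summarize_binary_performance(model, X_val, y_val):
    """Return TP/FP/FN/TN counts plus precision, recall, and F1 (binary labels)."""
    y_pred = model.predict(X_val)
    tn, fp, fn, tp = confusion_matrix(y_val, y_pred).ravel()
    return {
        "tp": tp, "fp": fp, "fn": fn, "tn": tn,
        "precision": precision_score(y_val, y_pred),
        "recall": recall_score(y_val, y_pred),
        "f1": f1_score(y_val, y_pred),
    }
```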
DNNs showed moderate performance, with lower classification accuracy and AUC than CNNs and LightGBM. The learning curves exhibited overfitting, as the model achieved near-perfect training accuracy but struggled to generalize to the validation and test datasets.26 DNNs lack the inherent ability to capture spatial or temporal dependencies in the data, which are critical for EEG signal analysis. Without convolutional layers, DNNs are less equipped to handle the intricate patterns required for this task, limiting their effectiveness in epilepsy screening.

While LightGBM exhibited strong initial validation performance, its lack of convergence on unseen data, evidenced by the disparity between training and test accuracies, highlights inherent challenges with overfitting in gradient-boosting frameworks applied to high-dimensional EEG data. Potential remedies, such as enhanced regularization (e.g., L1/L2 constraints, reduced tree depth) or feature dimensionality reduction, could mitigate these issues. Similarly, the underperformance of DNNs relative to CNNs suggests that architectural refinements (e.g., integrating convolutional layers or attention mechanisms) might improve their ability to model spatiotemporal EEG dynamics. However, a central tenet of ML is identifying and optimizing the model class best suited to the problem's intrinsic structure. The CNN's superior performance, stemming from its innate capacity to hierarchically extract localized spectral and temporal features without manual engineering, validates its prioritization for epilepsy-related EEG analysis. Accordingly, while theoretical avenues exist to improve LightGBM and DNNs, the CNN's clinically aligned accuracy, generalizability, and convergence behavior justify its selection as the foundation for translational tool development.
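As one concrete reading of the remedies noted above, the sketch below configures a LightGBM classifier with L1/L2 penalties, reduced tree depth, and feature subsampling. The hyperparameter values are assumptions for demonstration and are not the settings used in this study.

```python
# Illustrative sketch only: regularized LightGBM configuration for
# high-dimensional inputs. Values are assumptions, not the study's settings.
import lightgbm as lgb

model = lgb.LGBMClassifier(
    n_estimators=500,
    learning_rate=0.05,
    max_depth=4,           # reduced tree depth limits model complexity
    num_leaves=15,         # fewer leaves per tree (kept below 2**max_depth)
    reg_alpha=1.0,         # L1 penalty on leaf weights
    reg_lambda=1.0,        # L2 penalty on leaf weights
    min_child_samples=50,  # require more samples per leaf on noisy features
    subsample=0.8,         # row subsampling
    colsample_bytree=0.8,  # feature subsampling for high-dimensional data
)
# With hypothetical splits: model.fit(X_train, y_train,
#     eval_set=[(X_val, y_val)], callbacks=[lgb.early_stopping(50)])
```

Early stopping on a held-out validation set, combined with shallower trees and L1/L2 penalties, is a standard way to narrow the training-validation gap described above; feature dimensionality reduction applied upstream could serve a similar purpose.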
CNNs demonstrated the best overall performance, achieving higher accuracy, AUC, and F1 scores than DNNs and LightGBM. The confusion matrices for both the validation and testing datasets confirmed a balanced classification of epileptic and non-epileptic cases, with lower rates of misclassification. CNNs' ability to process spatial and temporal features makes them particularly well suited for non-stationary signals such as EEG data.27 Their convolutional layers extract hierarchical features, capturing both local patterns and broader dependencies, which are crucial for detecting epilepsy. In addition, the CNN learning curves showed a better convergence trend than those of LightGBM, reflecting the CNN's ability to generalize more effectively.
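The hierarchical feature extraction described here can be made concrete with a small 1D CNN over fixed-length EEG segments. The Keras sketch below is hypothetical; the segment length and layer sizes are assumptions rather than the architecture evaluated in this study.

```python
# Illustrative sketch only: a compact 1D CNN for single-channel EEG segments.
# The 178-sample segment length and layer sizes are assumptions.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_eeg_cnn(segment_length=178):
    model = models.Sequential([
        layers.Input(shape=(segment_length, 1)),
        layers.Conv1D(32, kernel_size=7, activation="relu"),  # local waveform patterns
        layers.MaxPooling1D(2),
        layers.Conv1D(64, kernel_size=5, activation="relu"),  # broader temporal context
        layers.MaxPooling1D(2),
        layers.Dropout(0.3),
        layers.GlobalAveragePooling1D(),
        layers.Dense(64, activation="relu"),
        layers.Dense(1, activation="sigmoid"),  # epileptic vs. non-epileptic
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy", tf.keras.metrics.AUC(name="auc")])
    return model
```

Stacking small kernels with pooling progressively widens the receptive field, which is one way to capture both local spikes and longer-range temporal structure without manual feature engineering.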
A critical consideration for the clinical deployment of CNN-based EEG analysis is model interpretability.