each representing an EEG examination from a different patient. It includes data from various channels across the scalp, capturing different frequency bands. The dataset comprises 668 columns, of which 253 columns represent specific frequencies recorded at different scalp locations. Additional columns provide statistical metrics (e.g., mean and standard deviation) for specific frequency bands at various times. Local sleep-wake transition band values are also recorded. The target column indicates whether the individual examined was epileptic (1) or non-epileptic (0). This comprehensive dataset was used to identify EEG features indicative of epilepsy and to develop predictive models capable of screening patients for epilepsy.

2.2. Preprocessing
The dataset was preprocessed to ensure suitability for machine learning modeling. Initially, the data were loaded into the Python environment and the target column was separated from the predictor variables. The dataset was then divided into three subsets for training, validation, and testing, with proportions of 70%, 20%, and 10%, respectively. Stratified sampling was employed to preserve the original class distributions in each subset, ensuring that both epileptic and non-epileptic cases were adequately represented. Predictor variables were standardized using a StandardScaler (a class from the Python library scikit-learn) to normalize the feature values. This step was critical for the neural network models to achieve consistent convergence during training. For the convolutional neural network (CNN), the standardized predictor data were reshaped to include a channel dimension, enabling the model to interpret each feature as part of a temporal sequence.
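
The following sketch illustrates this pipeline. The file name and the label-column name ("target") are placeholders; the split proportions, stratification, standardization, and CNN reshaping follow the description above.

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("eeg_dataset.csv")        # hypothetical file name
X = df.drop(columns=["target"]).values     # predictor variables
y = df["target"].values                    # 1 = epileptic, 0 = non-epileptic

# 70% training; the remaining 30% is split into 20% validation and 10% test.
# Stratifying on the labels preserves the class proportions in every subset.
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=1/3, stratify=y_rest, random_state=42)

# Standardize the features; the scaler is fitted on the training subset only.
scaler = StandardScaler().fit(X_train)
X_train, X_val, X_test = (scaler.transform(s) for s in (X_train, X_val, X_test))

# For the CNN, append a channel dimension: (n_samples, n_features, 1).
X_train_cnn, X_val_cnn, X_test_cnn = (a[..., None] for a in (X_train, X_val, X_test))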

2.3. Machine learning models
Three distinct machine-learning approaches were implemented and evaluated in this study. Each model architecture was carefully designed to leverage the unique characteristics of the EEG dataset and optimize performance for epilepsy prediction.

2.3.1. Dense neural network (DNN)
The dense neural network (DNN) was constructed using the Keras framework. This model comprised an input layer configured to match the dimensions of the predictor variables, followed by three hidden layers. Each hidden layer employed rectified linear unit (ReLU) activation functions to introduce non-linearity and enhance the model's ability to capture complex patterns in the data. Dropout layers with a rate of 0.3 were incorporated after each hidden layer to mitigate overfitting by randomly disabling a fraction of neurons during training. The final output layer utilized a sigmoid activation function to produce a binary classification output (epileptic or non-epileptic). The DNN was trained using the Adam optimizer, which dynamically adjusts learning rates to accelerate convergence. The binary cross-entropy loss function was chosen to optimize classification accuracy. Training was conducted over 50 epochs with a batch size of 32, and early stopping was applied based on validation performance to prevent overfitting.
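
A minimal sketch of this architecture is given below. The hidden-layer widths (256, 128, and 64 units) and the early-stopping patience are assumptions, as the text does not specify them; the layer count, dropout rate, optimizer, loss, epoch count, and batch size follow the description above.

from tensorflow import keras
from tensorflow.keras import layers

def build_dnn(n_features):
    # Input layer matching the predictor dimensions, three ReLU hidden
    # layers with dropout (0.3) after each, and a sigmoid output unit.
    model = keras.Sequential([
        keras.Input(shape=(n_features,)),
        layers.Dense(256, activation="relu"),   # hidden widths are assumptions
        layers.Dropout(0.3),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.3),
        layers.Dense(64, activation="relu"),
        layers.Dropout(0.3),
        layers.Dense(1, activation="sigmoid"),  # epileptic (1) vs. non-epileptic (0)
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

dnn = build_dnn(X_train.shape[1])
dnn.fit(X_train, y_train,
        validation_data=(X_val, y_val),
        epochs=50, batch_size=32,
        callbacks=[keras.callbacks.EarlyStopping(   # patience is illustrative
            monitor="val_loss", patience=5, restore_best_weights=True)])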

2.3.2. Convolutional neural network
The CNN was designed to exploit the spatial and temporal relationships within the EEG data. The model architecture began with two convolutional layers, each equipped with ReLU activation functions and configured with 32 and 64 filters, respectively. These layers were followed by max-pooling operations to downsample the feature maps and reduce computational complexity. Dropout layers, with a rate of 0.3, were included to enhance generalization by preventing overfitting. The CNN further included a flattening layer to convert the multi-dimensional feature maps into a one-dimensional feature vector. This vector was processed by a series of dense layers, culminating in an output layer identical to that of the DNN. Training for the CNN followed a similar protocol, employing the Adam optimizer and the binary cross-entropy loss function, with model evaluation occurring at each epoch to monitor progress.
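
The sketch below mirrors this architecture under the same caveats; the kernel size (7) and the width of the dense layer (64 units) are assumptions, chosen within the tuning ranges reported in Section 2.3.4.

from tensorflow import keras
from tensorflow.keras import layers

def build_cnn(n_features):
    model = keras.Sequential([
        keras.Input(shape=(n_features, 1)),          # channel axis from preprocessing
        layers.Conv1D(32, kernel_size=7, activation="relu"),
        layers.MaxPooling1D(pool_size=2),
        layers.Dropout(0.3),
        layers.Conv1D(64, kernel_size=7, activation="relu"),
        layers.MaxPooling1D(pool_size=2),
        layers.Dropout(0.3),
        layers.Flatten(),                            # feature maps -> 1-D vector
        layers.Dense(64, activation="relu"),
        layers.Dense(1, activation="sigmoid"),       # same output head as the DNN
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

cnn = build_cnn(X_train_cnn.shape[1])
cnn.fit(X_train_cnn, y_train,
        validation_data=(X_val_cnn, y_val),          # evaluated at each epoch
        epochs=50, batch_size=32)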

2.3.3. PyCaret machine learning
The third approach involved the use of PyCaret, an automated machine learning framework. PyCaret's classification module was configured to preprocess the data, evaluate multiple machine learning algorithms, and identify the best-performing model based on a variety of metrics. Following the automated model comparison, the selected model underwent hyperparameter tuning to further optimize performance. The final tuned model was then evaluated on the test set, with key metrics such as accuracy, precision, recall, and F1-score calculated to assess its predictive capability.
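
A sketch of this workflow using the PyCaret 2.3 classification API is shown below. The DataFrame names (train_df, test_df) are placeholders, and how the held-out subsets map onto PyCaret's setup is an assumption, since the text does not detail it.

from pycaret.classification import setup, compare_models, tune_model, predict_model

# train_df / test_df are placeholder DataFrames that still contain the
# "target" column; PyCaret performs its own internal preprocessing.
clf = setup(data=train_df, target="target", session_id=42, silent=True)
best = compare_models()                       # rank candidate algorithms
tuned = tune_model(best)                      # automated hyperparameter tuning
holdout = predict_model(tuned, data=test_df)  # accuracy, precision, recall, F1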

2.3.4. Model implementation and training details
All models were implemented in Python 3.8.10, using TensorFlow 2.7.0 (Keras API) for the neural networks and PyCaret 2.3.10 for LightGBM. Hyperparameter optimization proceeded in two stages: (1) manual grid searches to constrain the search ranges, followed by (2) Keras Tuner (20 trials) for the CNN and DNN, targeting kernel sizes (3–15 samples), dropout rates (0.2–0.5), and L2 regularization (1e-4 to 1e-2). ReLU activations in the hidden layers and a sigmoid activation at the output ensured non-linear feature learning and binary classification outputs.
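
A sketch of stage (2) using the keras-tuner package is given below, shown for a simplified CNN; everything beyond the stated search ranges and trial count (the tuner type, objective, and fixed layers) is an assumption.

import keras_tuner as kt
from tensorflow import keras
from tensorflow.keras import layers, regularizers

n_features = X_train_cnn.shape[1]

def build_tunable_cnn(hp):
    # Search ranges follow the text; the fixed parts of the model are
    # trimmed down for illustration.
    model = keras.Sequential([
        keras.Input(shape=(n_features, 1)),
        layers.Conv1D(
            32,
            kernel_size=hp.Int("kernel_size", 3, 15),
            activation="relu",
            kernel_regularizer=regularizers.l2(
                hp.Float("l2", 1e-4, 1e-2, sampling="log"))),
        layers.MaxPooling1D(pool_size=2),
        layers.Dropout(hp.Float("dropout", 0.2, 0.5)),
        layers.Flatten(),
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy")
    return model

tuner = kt.RandomSearch(build_tunable_cnn, objective="val_loss",
                        max_trials=20)        # 20 trials, as stated in the text
tuner.search(X_train_cnn, y_train,
             validation_data=(X_val_cnn, y_val),
             epochs=50, batch_size=32)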

