Page 43 - AJWEP-v22i3
SpillNet CNN model for oil spill detection
phase involved output generation and performance evaluation. Detailed explanations of the steps are provided in Sections 3.1-3.5.

3.1. Data collection and pre-processing

The data for this study, as detailed by Krestenitis et al.,³² consists of SAR images obtained from the Sentinel-1 satellites, part of the European Space Agency’s Copernicus program. These satellites are equipped with a C-band SAR system, providing wide coverage with a pixel spacing of 10 × 10 m, and operate effectively under various weather and illumination conditions. The dataset includes 1,112 SAR images, each annotated with ground truth masks confirming the presence of oil spills, which were verified through records from the European Maritime Safety Agency.³²

To address the challenge of class imbalance and enhance the model’s generalization capabilities, various data augmentation techniques were applied. These techniques include random resizing, horizontal and vertical flipping, and random cropping of the images. This augmentation process helps create a more robust training dataset by introducing variability in the input images, simulating different scales and orientations of oil spills.

The pre-processing pipeline, as implemented by Krestenitis et al.,³² involves several steps to ensure the SAR images are in a suitable format for the CNN model:
(i) Localization and cropping: Each confirmed oil spill was localized according to the European Maritime Safety Agency records, and relevant regions were cropped from the raw SAR images
(ii) Rescaling: The cropped images were rescaled to a resolution of 1250 × 650 pixels to standardize the input size
(iii) Radiometric calibration: This step was applied to project the images into a common plane and correct any variations in sensor readings
(iv) Speckle noise filtering: A 7 × 7 median filter was used to reduce speckle noise, which is common in SAR images
(v) Conversion to real luminosity values: A linear transformation was applied to convert the images from decibel scale to real luminosity values, ensuring consistent brightness levels across the dataset.³²

3.2. Modification of the CNN architecture and algorithm design

The proposed modified CNN architecture, termed SpillNet, is specifically designed for efficient and accurate oil spill detection in SAR images. This architecture integrates multiple DSCL, BN, and RC to enhance feature extraction, learning, and classification capabilities.

The modifications were implemented by adjusting several architectural components of the CNN model, including convolutional layers, residual blocks, filter sizes, data augmentation techniques (such as random resizing, horizontal/vertical flipping, and cropping), regularization methods, and hyperparameter tuning. By introducing data augmentation, the size of the training dataset was increased, allowing the model to learn across a broad range of input parameters, thus fostering robust classification. The modified CNN model comprised an initial and final convolution layer, with upsampling and downsampling layers in between. The purpose of multiple CNN layers was to allow the network to learn complex data, promoting effective feature extraction, learning, training, and classification capabilities. The different filter sizes employed at each layer allowed the model to capture and extract features at various scales and layers. For instance, small filters capture fine features, while large filters capture more significant features, enabling the model to identify relationships between the images for improved classification.

Depthwise separable convolutions significantly reduce the number of parameters and the computational burden without compromising performance.³³ RC, introduced by He et al.,³⁴ facilitates the training of very deep networks by providing shortcut paths for gradient flow.³⁴ This helps prevent the gradients from becoming too small as they propagate back through the network. Downsampling reduces the spatial dimensions, allowing the network to capture coarse features and reduce computational complexity, while upsampling restores the original resolution, refining the extracted features for accurate reconstruction.³⁵

The architecture components included the following:
(i) Input layer: Accepted SAR images with a specified shape of 224 × 224 × 3
(ii) Initial convolution layer (conv1): A 128-filter 7 × 7 convolutional layer with padding, followed by BN and ReLU activation to detect initial features
(iii) Residual blocks: Four residual blocks, each containing two separable convolutional layers with 128 filters and 3 × 3 kernels, followed by BN and ReLU activation after each convolution. The block’s input was added to its output to maintain gradient flow
(iv) Downsample layers: Downsample 1: A 256-filter 3 × 3 convolution with stride 2, followed by BN and
Volume 22 Issue 3 (2025) 37 doi: 10.36922/ajwep.8282
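The statement in Section 3.2 that depthwise separable convolutions reduce the number of parameters can be checked with simple arithmetic for the 3 × 3, 128-filter layers of the residual blocks. The sketch below is illustrative only and is not taken from the authors' implementation. It assumes that the block input also has 128 channels, that the depth multiplier is 1, and that bias and BN parameters are ignored; the function names are hypothetical.

```python
def conv_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a standard kernel x kernel convolution (bias ignored)."""
    return kernel * kernel * c_in * c_out


def separable_conv_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a depthwise separable convolution (bias ignored)."""
    depthwise = kernel * kernel * c_in  # one spatial filter per input channel
    pointwise = c_in * c_out            # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Layer shape from the residual blocks: 3 x 3 kernels, 128 filters.
# Assumption: the input to the layer also has 128 channels.
KERNEL, C_IN, C_OUT = 3, 128, 128

standard = conv_params(KERNEL, C_IN, C_OUT)
separable = separable_conv_params(KERNEL, C_IN, C_OUT)

print(f"standard 3x3 conv:  {standard:,} weights")
print(f"separable 3x3 conv: {separable:,} weights")
print(f"reduction factor:   {standard / separable:.1f}x")
```

Under these assumptions, a standard layer holds 147,456 weights and the separable version holds 17,536, a reduction of roughly 8.4 times per layer. The saving grows with the number of output channels, because the 3 × 3 spatial filters are no longer replicated for every input-output channel pair.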