then aggregated for clustering analysis. This step involves summarising the data in a format that facilitates comparison across firms and years. Having calculated these volatility scores, we now move to the dimension reduction step, which allows us to distill the most critical patterns from our multi-year dataset.
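Before moving on, here is a minimal sketch of the aggregation step just described: pivoting long-format scores into a firm-by-year matrix. The column names (`firm`, `year`, `parkinson`) and values are hypothetical, since the paper does not specify its data layout.

```python
import pandas as pd

# Hypothetical long-format table of annual Parkinson volatility scores:
# one row per firm-year (names and values are illustrative only).
scores = pd.DataFrame({
    "firm": ["A", "A", "B", "B", "C", "C"],
    "year": [2006, 2007, 2006, 2007, 2006, 2007],
    "parkinson": [0.21, 0.34, 0.18, 0.25, 0.40, 0.31],
})

# Pivot to a firms-by-years matrix so each row summarises one firm's
# volatility history, making firms and years directly comparable.
matrix = scores.pivot(index="firm", columns="year", values="parkinson")
print(matrix)
```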
3.4. Principal component analysis (PCA) for dimension reduction
Principal Component Analysis (PCA) is a statistical method that transforms a set of possibly correlated variables into a set of linearly uncorrelated variables called principal components. This transformation is achieved so that the first principal component captures the maximum variance in the data, and each subsequent component captures the remaining variance subject to being orthogonal to the previous components [48].
The first step in PCA involves finding a linear combination of the original variables that maximizes the variance. Let $x = (x_1, x_2, \ldots, x_p)'$ be a vector of $p$ random variables. The aim is to determine a vector $\alpha_1 = (\alpha_{11}, \alpha_{12}, \ldots, \alpha_{1p})'$ such that the linear combination $\alpha_1' x$ has the maximum variance. This is represented as:

$$\alpha_1' x = \sum_{j=1}^{p} \alpha_{1j} x_j \tag{2}$$

To maximize the variance of $\alpha_1' x$, we seek to maximize $\operatorname{var}(\alpha_1' x)$. The variance of this linear combination is given by:

$$\operatorname{var}(\alpha_1' x) = \alpha_1' \Sigma \alpha_1 \tag{3}$$

where $\Sigma$ is the covariance matrix of $x$. To find a meaningful solution, we impose the normalization constraint:

$$\alpha_1' \alpha_1 = 1 \tag{4}$$

This leads to the optimization problem using a Lagrange multiplier $\lambda$:

$$\text{maximize } \alpha_1' \Sigma \alpha_1 - \lambda(\alpha_1' \alpha_1 - 1) \tag{5}$$

Differentiating with respect to $\alpha_1$ and setting the gradient to zero yields the eigenvalue equation:

$$\Sigma \alpha_1 = \lambda \alpha_1 \quad \text{or} \quad (\Sigma - \lambda I_p)\,\alpha_1 = 0 \tag{6}$$

Thus, $\lambda$ must be an eigenvalue of $\Sigma$, and $\alpha_1$ is the corresponding eigenvector. The eigenvector associated with the largest eigenvalue $\lambda_1$ gives the first principal component, and the variance of this component is $\lambda_1$.
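To make equations (2)–(6) concrete, here is a minimal numerical sketch on synthetic data (not the paper's dataset): the unit-norm eigenvector of the sample covariance matrix with the largest eigenvalue plays the role of $\alpha_1$, and the variance of the projected scores reproduces $\lambda_1$.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))      # synthetic data: 500 observations, p = 4
X[:, 1] += 0.8 * X[:, 0]           # induce correlation between two variables

Sigma = np.cov(X, rowvar=False)    # covariance matrix of x, as in eq. (3)

# Eigendecomposition of Sigma; eigh returns eigenvalues in ascending order
# and unit-norm eigenvectors, so the constraint of eq. (4) holds.
eigvals, eigvecs = np.linalg.eigh(Sigma)
lambda1 = eigvals[-1]              # largest eigenvalue
alpha1 = eigvecs[:, -1]            # corresponding eigenvector alpha_1

# var(alpha_1' x) equals lambda_1, as eq. (6) implies.
scores = X @ alpha1
print(np.var(scores, ddof=1), lambda1)   # the two values agree
```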
The second principal component, $\alpha_2' x$, is found by maximizing $\operatorname{var}(\alpha_2' x) = \alpha_2' \Sigma \alpha_2$ subject to being uncorrelated with the first principal component. This condition is expressed as:

$$\operatorname{cov}(\alpha_1' x, \alpha_2' x) = \alpha_1' \Sigma \alpha_2 = 0 \quad \text{and} \quad \alpha_2' \alpha_2 = 1 \tag{7}$$

The optimization problem becomes:

$$\text{maximize } \alpha_2' \Sigma \alpha_2 - \lambda(\alpha_2' \alpha_2 - 1) - \phi\, \alpha_2' \alpha_1 \tag{8}$$

Differentiating with respect to $\alpha_2$ and solving similarly yields:

$$\Sigma \alpha_2 = \lambda \alpha_2 \tag{9}$$

Here, $\lambda$ is the second largest eigenvalue of $\Sigma$, and $\alpha_2$ is the corresponding eigenvector.

In general, the $k$-th principal component $\alpha_k' x$ maximizes $\operatorname{var}(\alpha_k' x)$ subject to being uncorrelated with all preceding principal components. The variance of $\alpha_k' x$ is given by:

$$\operatorname{var}(\alpha_k' x) = \lambda_k \tag{10}$$

where $\lambda_k$ is the $k$-th largest eigenvalue of $\Sigma$, and $\alpha_k$ is the corresponding eigenvector. This iterative process continues until all $p$ principal components are identified.
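The same kind of synthetic sketch extends to all $p$ components at once, illustrating equations (7)–(10): the scores on the full set of eigenvectors are mutually uncorrelated, and their variances are the ordered eigenvalues $\lambda_1 \geq \cdots \geq \lambda_p$. Again, this is an illustration on random data, not the study's.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
X[:, 1] += 0.8 * X[:, 0]

Sigma = np.cov(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(Sigma)   # eigenvalues in ascending order

A = eigvecs[:, ::-1]    # columns alpha_1, ..., alpha_p, largest lambda first
Z = X @ A               # scores on all p principal components

# The scores' covariance matrix is (numerically) diagonal -- the components
# are mutually uncorrelated, as in eq. (7) -- and its diagonal entries are
# lambda_1 >= ... >= lambda_p, i.e. var(alpha_k' x) = lambda_k, eq. (10).
print(np.round(np.cov(Z, rowvar=False), 6))
print(eigvals[::-1])    # matches the diagonal above
```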
For this study, Principal Component Analysis (PCA) was utilized as a critical step in dimension reduction, focusing on the Parkinson volatility scores calculated from 2006 to 2023. PCA is a statistical technique that simplifies the complexity of high-dimensional data by transforming it into fewer dimensions that retain most of the original variance.

For our dataset, PCA was applied to the annual Parkinson scores of each firm in the BIST100 index, excluding financial companies. The goal of applying PCA was to reduce the dimensionality of the 18 years of volatility data to a smaller number of more interpretable components that capture the most significant variance in volatility patterns across the firms over the years.
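As a sketch of how this application could look in code, the following uses scikit-learn's `PCA` on a stand-in firms-by-years matrix of Parkinson scores; the random data, the firm count, and the column standardization are assumptions for illustration, not the authors' published pipeline. Retaining two components mirrors the cumulative-variance criterion discussed below.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
# Stand-in for the real input: rows = non-financial BIST100 firms (count
# arbitrary here), columns = annual Parkinson scores for 2006-2023.
matrix = rng.lognormal(sigma=0.3, size=(80, 18))

# Standardizing the yearly columns is an assumed preprocessing choice.
Xs = StandardScaler().fit_transform(matrix)

pca = PCA(n_components=2)
scores = pca.fit_transform(Xs)     # each firm mapped to two component scores

# Share of total variance captured by each retained component, and in total.
print(pca.explained_variance_ratio_, pca.explained_variance_ratio_.sum())
print(scores.shape)                # (n_firms, 2)
```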
The analysis resulted in the extraction of two principal components. These components were selected because they cumulatively captured the majority of the variance in the dataset, providing a clear and simplified view of volatility dynamics across different companies over time. The significant eigenvalues of these two components also