Page 193 - IJOCTA-15-1

Key drivers of volatility in BIST100 firms using machine learning segmentation
robust and consistent dataset for our analysis, allowing us to draw meaningful conclusions about volatility patterns in the Turkish market.

This study consists of two steps. In the first step, the dataset created for the study was grouped into low-volatility and high-volatility companies using machine learning methods. In the second step, panel regression analysis was performed to reveal the determinants of volatility.

3.1. Data collection

The primary dataset source is TradingView, a reputable platform for financial data and trading insights. TradingView provides access to historical stock price data, which is crucial for calculating the volatility measures used in this research. The dataset includes daily closing prices necessary for computing Parkinson volatility scores, which are the fundamental variables in our analyses. The choice of daily closing prices aligns with established practices in volatility research and provides a balance between granularity and manageability of data. The data needed for the second stage of the study, the panel regression analysis, was obtained from the Finnet database.

The reviewed literature highlights consistent determinants of stock price volatility, such as firm size, dividend policy, leverage, and trading volume, analyzed through methods like GARCH and linear regression [15,17–19]. However, a gap exists in applying machine learning techniques, such as PCA and K-means clustering, to analyze volatility in emerging markets like Turkey's BIST100 index. Additionally, few studies combine machine learning clustering with panel regression to explore financial ratios' impact on volatility within distinct firm groups.

This study addresses these gaps by integrating machine learning and econometric methods, offering a novel approach to segment firms into low- and high-volatility groups. By combining PCA and K-means clustering with panel regression, we provide deeper insights into how financial ratios influence volatility within these groups. This contributes original value by enhancing the precision of volatility analysis in emerging markets and expanding understanding of risk determinants in the Turkish context.

3.2. Data processing

Following data collection, a series of preprocessing steps was undertaken to guarantee the accuracy and consistency of the dataset for subsequent analysis. The data underwent cleaning to eliminate inconsistencies, such as missing values or outliers that could distort the results. Additionally, prices were adjusted for splits and dividends to provide a consistent basis for comparison over time. Finally, data points were cross-referenced with other market sources to confirm accuracy.

This rigorous approach to data collection and preprocessing from TradingView ensures that our analysis is based on reliable and consistent data, providing a solid foundation for the statistical and machine learning techniques applied in the study. The resulting dataset not only aids in identifying volatility levels among the selected firms but also helps understand the impact of various financial ratios on these volatility measures.

With our adequately cleaned and prepared dataset, we can now proceed to the crucial step of measuring volatility across our sample of BIST100 firms.

3.3. Volatility measurement

In financial terms, volatility represents the degree of variation in the trading prices of stocks over a specific period. It is a crucial measure of risk and uncertainty in financial markets. For this study, we focused on calculating annual Parkinson volatility scores based on daily trading data of firms listed on the BIST100.

Parkinson Volatility Calculation: The Parkinson volatility measure, introduced by Michael Parkinson in 1980, uses a stock's highest and lowest prices to provide a more accurate estimate of its volatility compared to traditional methods that only use closing prices [47]. The following formula is employed to measure volatility. In this formula, $H_i$ and $L_i$ represent the highest and lowest prices of the stock on day $i$, respectively, while $n$ denotes the number of trading days in the year.

\[
\sigma_{\text{Parkinson}} = \sqrt{\frac{1}{4\log(2)} \times \frac{1}{n} \sum_{i=1}^{n} \left( \log \frac{H_i}{L_i} \right)^{2}} \tag{1}
\]

In order to calculate the annual Parkinson volatility scores, each stock's daily high and low prices were aggregated for each year from 2006 to 2023. For each firm and each calendar year in the dataset, the Parkinson volatility formula was applied using all trading days of that year to calculate the annual volatility score. This method effectively captures intra-year price fluctuations, providing a nuanced picture of volatility. The annual volatility scores calculated for each firm were
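
As an illustration of Equation (1) and the per-firm, per-year aggregation described in Section 3.3, the calculation could be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation used in the study: the row layout (ticker, ISO date, high, low), the function names `parkinson_volatility` and `annual_parkinson`, and the toy prices are all hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict


def parkinson_volatility(highs, lows):
    """Equation (1): sqrt( 1/(4 ln 2) * (1/n) * sum( ln(H_i / L_i)^2 ) )."""
    if len(highs) != len(lows) or not highs:
        raise ValueError("highs and lows must be non-empty and equally long")
    sum_sq = sum(math.log(h / l) ** 2 for h, l in zip(highs, lows))
    return math.sqrt(sum_sq / (4.0 * math.log(2.0) * len(highs)))


def annual_parkinson(rows):
    """Return {(ticker, year): score} from daily rows.

    Each row is (ticker, iso_date, high, low); this layout is an assumption
    made for the sketch, not the format of the study's dataset.
    """
    grouped = defaultdict(lambda: ([], []))
    for ticker, iso_date, high, low in rows:
        year = int(iso_date[:4])  # "YYYY-MM-DD" -> calendar year
        highs, lows = grouped[(ticker, year)]
        highs.append(high)
        lows.append(low)
    # Apply Equation (1) to all trading days of each firm-year.
    return {key: parkinson_volatility(h, l) for key, (h, l) in grouped.items()}


# Toy rows with made-up prices, only to show the call pattern.
rows = [
    ("AAA", "2022-01-03", 10.5, 10.0),
    ("AAA", "2022-01-04", 10.8, 10.2),
    ("AAA", "2023-01-02", 12.0, 11.0),
    ("BBB", "2022-01-03", 5.0, 5.0),
]
scores = annual_parkinson(rows)
```

Grouping by (ticker, year) before applying the formula mirrors the description of one score per firm and calendar year. A day on which the high equals the low contributes zero to the sum, so a firm-year made up only of such days receives a score of zero.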