Anastasia Kostaki, Javier M. Moguerza, Alberto Olivares and Stelios Psarakis
where $\varepsilon_i$ are independent random variables with zero mean and constant variance. To
estimate the unknown function $m$ at a point $x$, the values of the response variable are averaged
locally. The smoothness of the resulting estimator is controlled by a bandwidth, which determines
the width of the neighbourhood over which the averaging is performed. As a result, the estimator of
the function $m$ takes the form:
$$\hat{m}_h(x) = n^{-1} \sum_{i=1}^{n} W_h(x; X_1, X_2, \ldots, X_n)\, Y_i,$$
where $W_h$ is a weight function depending on the bandwidth parameter $h$ and the variables
$X_1, X_2, \ldots, X_n$.
The shape of the weight function $W_h$ is given by a so-called kernel function $K$, in which the
bandwidth $h$ acts as a scale parameter that adjusts the size and the form of the weights around
$x$. Hence, kernel regression estimators are local weighted averages of the response variable, with
weights determined by the kernel function $K$ and the size of the weights depending on the
bandwidth parameter. Usually, for regression purposes, $K$ has the properties of a probability
density function: it is generally a positive, smooth function that peaks at zero and decreases
monotonically as its argument increases in size.
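As an illustration of the local weighted average described above, the following minimal sketch uses Nadaraya-Watson type weights with a Gaussian kernel. Both choices are assumptions made for the example, since the text does not fix a particular weight function at this point; the function names `gaussian_kernel` and `local_average` and the toy data are hypothetical. In terms of the general form, the weights here correspond to $W_h = n K((x - X_i)/h) / \sum_j K((x - X_j)/h)$.

```python
import math


def gaussian_kernel(u):
    """Standard normal density: positive, smooth, peaks at zero."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def local_average(x, xs, ys, h):
    """Kernel-weighted average of the responses ys around the point x."""
    # Raw weights K((x - X_i) / h): large for X_i near x, small far away.
    raw = [gaussian_kernel((x - xi) / h) for xi in xs]
    total = sum(raw)
    # Dividing by the total makes the weights sum to one.
    return sum(w * yi for w, yi in zip(raw, ys)) / total


# Toy data (illustrative only).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 2.0, 5.0, 4.0]

small_h = local_average(2.0, xs, ys, h=0.1)    # nearly the response at x = 2
large_h = local_average(2.0, xs, ys, h=100.0)  # nearly the overall mean
```

A very small bandwidth reproduces the nearest observation, while a very large one approaches the global mean of the responses, which is the sense in which the bandwidth controls the smoothness of the estimator.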
A detailed review of the formulae proposed in the literature for the kernel estimator $\hat{m}$ of
the regression mean function $m$ can be found in Peristera and Kostaki (2005), where it is shown
that the Gasser-Müller estimator (Gasser and Müller, 1979, 1984) is an adequate estimator for the
graduation of mortality data. Its formula is:
$$\hat{m}_{GM}(x) = \sum_{i=1}^{n} Y_{[i]} \int_{(x_{(i-1)} + x_{(i)})/2}^{(x_{(i)} + x_{(i+1)})/2} K_h(x - u)\, du,$$
where $x_{(0)} = -\infty$, $x_{(n+1)} = \infty$, $x_{(i)}$ denotes the $i$th largest value of the
observed covariate values and $Y_{[i]}$ the corresponding response value.
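The Gasser-Müller formula above can be sketched in a few lines once a kernel is chosen. The sketch below assumes a Gaussian kernel with the scaling $K_h(u) = h^{-1} K(u/h)$, neither of which is specified in the text; under that assumption each integral has the closed form $\Phi((x - a)/h) - \Phi((x - b)/h)$ for an interval $[a, b]$, with $\Phi$ the standard normal distribution function. The names `normal_cdf` and `gasser_muller` are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal distribution function; also valid for +/- infinity."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def gasser_muller(x, xs, ys, h):
    """Gasser-Mueller estimate at x, assuming a Gaussian kernel."""
    # Order the observations by covariate value, keeping responses aligned.
    pairs = sorted(zip(xs, ys))
    xo = [p[0] for p in pairs]
    yo = [p[1] for p in pairs]
    n = len(xo)
    # Interval endpoints: midpoints between neighbouring covariate values,
    # with the outermost endpoints set to -infinity and +infinity.
    mids = [(xo[i] + xo[i + 1]) / 2.0 for i in range(n - 1)]
    s = [-math.inf] + mids + [math.inf]
    estimate = 0.0
    for i in range(n):
        # Integral of K_h(x - u) du over [s[i], s[i + 1]] in closed form.
        weight = normal_cdf((x - s[i]) / h) - normal_cdf((x - s[i + 1]) / h)
        estimate += yo[i] * weight
    return estimate


# Toy data (illustrative only); the input does not need to be sorted.
value = gasser_muller(2.0, [4.0, 0.0, 2.0, 1.0, 3.0], [4.0, 1.0, 2.0, 3.0, 5.0], h=0.5)
```

Because the intervals partition the real line, the weights sum to one, so a constant response is reproduced exactly, up to rounding.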
Regarding the selection of the bandwidth parameter, descriptions of the available techniques can be
found in Härdle (1990, 1991) and Peristera and Kostaki (2005). A typical way to select the
bandwidth parameter is to build a direct plug-in estimator of the optimal smoothing parameter $h$.
Gasser et al. (1991) described how the unknown quantities involved can be effectively estimated and
provided explicit expressions for $h$ appropriate to the Gasser-Müller estimator. The choice
between a global and a local bandwidth is another crucial decision. A local selection allows the
use of a smaller bandwidth in areas of high density and a larger bandwidth in areas of low density
(see Brockmann et al., 1993, and Hermann, 1997, for discussions of the advantages of kernel
regression estimators with a local bandwidth). The underlying idea of the plug-in method is to
select the optimal bandwidths by estimating the asymptotically optimal mean integrated squared
error bandwidths. Hermann (1997) developed a generalization of the global iterative plug-in
algorithm of Gasser et al. (1991) for the selection of a local bandwidth, and showed the advantages
of the local selection over the global plug-in rule and the cross-validation method.
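To make data-driven bandwidth selection concrete, the sketch below implements leave-one-out cross-validation for a single global bandwidth, the comparison method named above. It is not the plug-in rule of Gasser et al. (1991) or its local generalization, which require estimates of additional unknown quantities. The Gaussian kernel weights, the candidate grid, the toy data and the function names are all assumptions made for illustration.

```python
import math


def kernel_average(x, xs, ys, h):
    """Gaussian-kernel local weighted average of ys around x."""
    weights = [math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in xs]
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed (h far too small): use the nearest point.
        nearest = min(range(len(xs)), key=lambda j: abs(x - xs[j]))
        return ys[nearest]
    return sum(w * yi for w, yi in zip(weights, ys)) / total


def loo_cv_score(xs, ys, h):
    """Mean squared error when each point is predicted from all the others."""
    n = len(xs)
    error = 0.0
    for i in range(n):
        xs_rest = xs[:i] + xs[i + 1:]
        ys_rest = ys[:i] + ys[i + 1:]
        error += (ys[i] - kernel_average(xs[i], xs_rest, ys_rest, h)) ** 2
    return error / n


def select_bandwidth(xs, ys, candidates):
    """Return the candidate bandwidth with the smallest leave-one-out score."""
    return min(candidates, key=lambda h: loo_cv_score(xs, ys, h))


# Toy data: a smooth curve plus a small deterministic perturbation.
xs = [i / 10.0 for i in range(21)]
ys = [math.sin(2.0 * x) + 0.1 * (-1) ** i for i, x in enumerate(xs)]
candidates = [0.05, 0.2, 1.0, 5.0]
best_h = select_bandwidth(xs, ys, candidates)
```

The score penalises bandwidths that are too large, which oversmooth the curve, as well as those that are too small, which follow the noise. A local variant would repeat this kind of selection separately around each estimation point rather than once for the whole sample.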
4. Support Vector Machines
The SVM technique belongs to the family of regularisation methods (Moguerza and Muñoz, 2006), which
also includes splines. In fact, the two methodologies, SVM and splines, are closely related (Pearce
and Wand, 2006). Next, we briefly describe the regression version of SVM and its main features. SVM
can be presented through its geometrical interpretation. Basically, the method works by solving an
optimization problem of the form (Tikhonov and Arsenin, 1977):
$$\min_{f \in H_K} \frac{1}{p} \sum_{i=1}^{p} L(f(x_i) - y_i) + M \|f\|_K^2,$$
where $(x_i, y_i)$, $i = 1, \ldots, p$, are a set of data with $x_i \in \mathbb{R}^n$ and
$y_i \in \mathbb{R}$, $L$ is a loss function, $M > 0$ is a constant that penalizes non-smoothness,
$H_K$ is a space of functions known as a Reproducing Kernel Hilbert Space (RKHS) (Aronszajn, 1950;
Moguerza and Muñoz, 2006), and $\|f\|_K$ is the norm
International Journal of Population Studies | 2016, Volume 2, Issue 1 5

