Page 74 - IJPS-8-2
International Journal of Population Studies
Projecting sex ratio at birth in Pakistan
$\alpha'^{(g)}_{p,t} = 0$ if $t < t^{(g)}_{0p}$ or $t > t^{(g)}_{3p}$,

where $t^{(g)}_{1p} = t^{(g)}_{0p} + \lambda^{(g)}_{1p}$, $t^{(g)}_{2p} = t^{(g)}_{1p} + \lambda^{(g)}_{2p}$, and $t^{(g)}_{3p} = t^{(g)}_{2p} + \lambda^{(g)}_{3p}$. Here $\xi^{(g)}_{p}$, $\lambda^{(g)}_{1p}$, $\lambda^{(g)}_{2p}$, and $\lambda^{(g)}_{3p}$ are the $g$-th posterior samples of the parameters $\xi_p$, $\lambda_{1p}$, $\lambda_{2p}$, and $\lambda_{3p}$.

Appendix C. Model validation

The performance of the inflation model was evaluated by two approaches: (1) out-of-sample validation and (2) one-province simulation.

C.1. Out-of-sample validation

We leave out 12% of the observations, selected by data collection year (observations collected since 2018) rather than by reference year, an approach that has been used for assessing model performance of demographic indicators largely based on survey data (Alkema et al., 2012; Alkema et al., 2014; Chao et al., 2018a,b). There are 64 left-out observations from six Pakistan provinces. After leaving out the data, we fit the model to the training dataset and obtain point estimates and credible intervals that would have been constructed from the available dataset in the selected survey year. Based on the training dataset, we also generate the predictive distribution for each left-out observation.

We calculate the median errors and median absolute errors in the left-out observations. The errors are defined as $e_j = y_j - \tilde{y}_j$, where $\tilde{y}_j$ refers to the posterior median of the predictive distribution based on the training dataset for the $j$-th left-out observation $y_j$. The coverage is given by $\frac{1}{J}\sum_{j=1}^{J} I[y_j \ge l_j]\, I[y_j \le u_j]$, where $J$ refers to the number of left-out observations, and $l_j$ and $u_j$ correspond to the lower and upper bounds, respectively, of the 95% prediction interval of the $j$-th left-out observation $y_j$.

The validation measures are calculated for 1000 sets of left-out observations, where each set contains one randomly selected left-out observation from each Pakistan province. The reported validation results are the mean outcomes over the 1000 sets of left-out observations. This technique is used to reduce the correlation of validation results within each province and has been used in validation exercises in previous studies (Alkema et al., 2014; Chao et al., 2018a; You et al., 2015).

Specifically, the final validation results regarding the left-out observations are calculated as follows:

1. For $k \in \{1, \dots, 1000\}$, we select a set of left-out observations $\{y_{k,1}, \dots, y_{k,p}, \dots, y_{k,6}\}$. $y_{k,p}$ is the only selected left-out observation from province $p$, and six provinces have left-out observations. Hence, we have $y_{k,1}$ to $y_{k,6}$.
2. For the $k$-th set of left-out observations $\{y_{k,1}, \dots, y_{k,p}, \dots, y_{k,6}\}$, we can get the following results:
• Corresponding errors $\{e_{k,1}, \dots, e_{k,p}, \dots, e_{k,6}\}$ for these selected left-out observations.
• Median of this set of errors: $\mathrm{median}(e)_k$.
• Coverage for this set: $\mathrm{Coverage}_k = \frac{1}{6}\sum_{p=1}^{6} I[y_{k,p} \ge l_{k,p}]\, I[y_{k,p} \le u_{k,p}]$. Here $l_{k,p}$ and $u_{k,p}$ correspond to the lower and upper bounds of the 95% prediction interval of the left-out observation $y_{k,p}$.
3. Compute the mean of these results for the 1000 sets of observations:
• Corresponding errors $\{e_{k,1}, \dots, e_{k,p}, \dots, e_{k,6}\}$ for these selected left-out observations.
• Final median of error: $\frac{1}{1000}\sum_{k=1}^{1000} \mathrm{median}(e)_k$.
• Final coverage: $\frac{1}{1000}\sum_{k=1}^{1000} \mathrm{Coverage}_k$.

For the point estimates obtained from the full and training datasets, we define the errors in the true SRB as $e(\Theta)_{p,t} = \hat{\Theta}_{p,t} - \tilde{\Theta}_{p,t}$, where $\hat{\Theta}_{p,t}$ is the posterior median in province $p$ in year $t$ obtained from the full dataset, and $\tilde{\Theta}_{p,t}$ is the posterior median in the same province-year obtained from the training dataset.

Similarly, the error in the sex ratio transition process with probability is defined as $e(\alpha\delta)_{p,t} = \hat{\alpha}_{p,t}\hat{\delta}_{p} - \tilde{\alpha}_{p,t}\tilde{\delta}_{p}$. The coverage is computed similarly to the left-out observations and is based on the lower and upper bounds of the 95% credible interval of $\tilde{\Theta}_{p,t}$ from the training dataset.

C.2. One-province simulation

We assess the inflation model performance in a one-province simulation setting. We simulate SRB for a province prior to observing data. In this simulation exercise, we consider all observations as the test data and simulate the SRB using the posterior samples of only the global parameters (instead of province-specific parameters) obtained from the sex ratio transition model using the full dataset. Hence, we simulate the SRB for a province without data and check how well the simulated results align with and cover the SRB observations in each province.

The $g$-th simulated SRB for a “new” province, $\Theta^{(g)}_{\mathrm{new},t}$, in year $t$ is obtained as follows for $g \in \{1, \dots, G\}$:
Volume 8 Issue 2 (2022) 68 https://doi.org/10.36922/ijps.v8i2.332
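The out-of-sample summary of Appendix C.1 (one randomly selected left-out observation per province, per-set median error and coverage, then the mean over 1000 sets) can be sketched in Python as follows. This is a minimal illustration and not the authors' code: the function name `validation_summary`, the record layout (keys `province`, `y`, `pred_median`, `lower`, `upper`), and the `seed` argument are assumptions made for the example. The predictive medians and 95% prediction bounds are taken as already computed from the training fit.

```python
import random
from statistics import mean, median


def validation_summary(observations, n_sets=1000, seed=0):
    """Average per-set validation measures over n_sets random sets.

    observations: list of dicts (hypothetical layout) with keys
      'province', 'y' (left-out observation), 'pred_median' (median of
      the predictive distribution), 'lower' and 'upper' (95% bounds).
    """
    rng = random.Random(seed)

    # Group the left-out observations by province.
    by_province = {}
    for obs in observations:
        by_province.setdefault(obs["province"], []).append(obs)

    median_errors, median_abs_errors, coverages = [], [], []
    for _ in range(n_sets):
        # Step 1: one randomly selected left-out observation per province.
        picked = [rng.choice(group) for group in by_province.values()]

        # Step 2: errors e = y - predictive median, and their summaries.
        errors = [obs["y"] - obs["pred_median"] for obs in picked]
        inside = [
            1.0 if obs["lower"] <= obs["y"] <= obs["upper"] else 0.0
            for obs in picked
        ]
        median_errors.append(median(errors))
        median_abs_errors.append(median([abs(e) for e in errors]))
        coverages.append(mean(inside))

    # Step 3: mean of the per-set results over all sets.
    return {
        "median_error": mean(median_errors),
        "median_abs_error": mean(median_abs_errors),
        "coverage": mean(coverages),
    }
```

With 64 left-out observations spread over six provinces, each call draws 1000 sets of six observations. Sampling one observation per province keeps any single province from dominating a set, which is the stated reason for this design.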

