
Journal of Chinese Architecture and Urbanism
Machine-simulated scoring of child-friendly streets

Figure 5. Image segmentation process. Source: Drawing by the authors

Each row of data represented the proportion of correctly classified pixels for each label, allowing us to directly obtain pixel accuracy by summing the proportions for all categories and averaging. Using this approach, we calculated a pixel accuracy of 0.9794, indicating high accuracy and effective model performance on this dataset.

Achieving child-friendly streets requires continuous, accessible walking paths and a safe walking environment (Global Designing Cities Initiative, 2019). The separation of sidewalks from vehicle traffic was measured by identifying fences or railings in the SVI data. Green spaces and parks were identified by highlighting trees and landscaped areas, creating a natural environment that contributes to physical and mental health. In addition, social safety was evaluated by identifying street lighting devices that contribute to creating safer urban environments for children and their caregivers.

4.2. Machine-simulated human scoring model

After collecting and processing SVI data, a fully convolutional neural network and a support vector machine prediction model were developed to forecast scores for specific scenes based on urban landscape characteristics. The model design is based on the scoring framework for human-machine adversarial models proposed by Yao et al. (2019) and Zhang et al. (2018). The prediction of human perception is presented as a classification task. Support vector machine, known for its practicality and widespread use in classification tasks, is used here to fine-tune the score range from 0 to 10 based on a single sample, differing from MIT Place Pulse’s binary classification format, which emphasizes comparative scoring (Zhang et al., 2018). The model used in this project to predict safety perception was trained after construction by referring to the datasets of Yao et al. (2019) and Han et al. (2022), specifically incorporating a dataset of Shenzhen with real score annotations provided by volunteers for each image. In this research, we utilized 4,000 annotated SVIs to enhance feature extraction precision, dividing these images into two sets: 80% for training and 20% for validation. The dataset covered nine feature categories along with a default scoring system for training purposes.

Support vector machine models provided the capability to handle non-linear decision boundaries, offering precise and dependable predictions. They are particularly suited for regression tasks and effectively handle intricate datasets, a crucial advantage in urban design applications where high precision is of utmost importance. Model evaluation involved calculating the mean square error for predictions made on the test dataset, with the mean square error scores utilized to evaluate the model’s performance and reliability.

Figure 6 illustrates the sequence of steps involved in training and operating the prediction model. This workflow demonstrates the steps required to initiate the fully convolutional neural network model for SVI feature extraction, followed by the support vector machine model for generating scoring predictions. The interconnected components and steps were designed to ensure robust and accurate model training, contributing to improved prediction accuracy and system robustness.

The fully convolutional neural network model’s input data comprised the first nine columns of the segmented dataset, with the final attribute being the score predicted by the model. The dataset featured nine distinct elements, namely “people,” “building,” “sky,” “fence_railing,” “tree_plant_grass,” “road,” “sidewalk,” “streetlight,” and “car.” The initial nine features served as input variables for the model, while the final feature represented the model’s target score. The dataset was partitioned into a training set (80%) and a test set (20%) with the “random_state” parameter ensuring reproducibility of the outcomes. The model, constructed using the TensorFlow Keras framework, employed a sequential architecture with a fully connected layer and a dropout layer to reduce overfitting. It was trained for 10 epochs, with 20% of the training data reserved for model validation, using the Adam optimizer and mean square error as the loss function.
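The support vector machine stage of the scoring workflow, with its 80/20 split and mean square error evaluation, can be illustrated with a minimal sketch. It rests on several assumptions not stated in the text: the regression variant (scikit-learn's `SVR`) is used because evaluation relies on mean square error, each feature is taken to be a pixel proportion, the data are synthetic, and the kernel, `C`, `epsilon`, and `random_state=42` values are illustrative only.

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

# The nine feature categories named in the text.
FEATURES = ["people", "building", "sky", "fence_railing", "tree_plant_grass",
            "road", "sidewalk", "streetlight", "car"]

# Synthetic stand-in data (assumption): each row is one street view image,
# described by nine pixel proportions that sum to 1.
rng = np.random.default_rng(0)
X = rng.dirichlet(np.ones(len(FEATURES)), size=200)

# Hypothetical target scores in the 0 to 10 range used in the text.
weights = rng.uniform(0, 10, size=len(FEATURES))
y = np.clip(X @ weights + rng.normal(0, 0.3, size=200), 0, 10)

# 80% training / 20% test; a fixed random_state makes the split reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# The RBF kernel lets the model fit non-linear relationships.
# Hyperparameter values here are illustrative, not the authors' settings.
model = SVR(kernel="rbf", C=10.0, epsilon=0.1)
model.fit(X_train, y_train)

# Evaluate on the held-out test set with mean square error.
y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
print(f"test MSE: {mse:.3f}")
```

With real data, `X` would hold the nine segmented feature columns and `y` the volunteer-annotated score for each image.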
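The pixel accuracy calculation described in this section, which sums the per-label proportions of correctly classified pixels and averages them, can be sketched as follows. The function name and the per-label values are hypothetical and are not the reported results. Averaging per-label proportions gives every label equal weight, whereas an overall pixel accuracy would weight each label by its pixel count.

```python
def mean_label_accuracy(correct_proportion_by_label):
    """Average the per-label proportions of correctly classified pixels."""
    if not correct_proportion_by_label:
        raise ValueError("at least one label is required")
    total = sum(correct_proportion_by_label.values())
    return total / len(correct_proportion_by_label)


# Hypothetical per-label proportions, for illustration only.
example = {
    "people": 0.95,
    "building": 0.98,
    "sky": 0.99,
    "fence_railing": 0.94,
    "tree_plant_grass": 0.98,
    "road": 0.99,
    "sidewalk": 0.96,
    "streetlight": 0.93,
    "car": 0.97,
}

print(f"mean per-label accuracy: {mean_label_accuracy(example):.4f}")
```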

Volume 7 Issue 1 (2025) 8 https://doi.org/10.36922/jcau.3578
