Journal of Chinese Architecture and Urbanism
Machine-simulated scoring of child-friendly streets
in more than 20 countries on four continents (Anguelov et al., 2010).

Since 2017, there has been a significant advancement in street-level image recognition research, with most scholars focusing on the quantitative assessment of urban spatial qualities, street morphology, and street-based human activities (He & Li, 2021). The Massachusetts Institute of Technology (MIT), for example, has leveraged street-level imagery to study greenery levels and sunlight exposure on streets (Salesses et al., 2013). Researchers at the University of Connecticut introduced a technique to categorize different land-use types and landscape features through street-level imagery (Li et al., 2015). Meanwhile, the City University of Hong Kong and Tsinghua University focus on researching street canyon quality, with an emphasis on physical environments and human activities in high-density cities. These scholars have proposed a framework for studying cities from a human-scale perspective (He & Li, 2021).

However, there is a scarcity of research explicitly addressing children’s points of view, such as their safety and comfort. Torres (2020) advocates for the consideration of children and adolescents, along with their activities, in street network design and development. He emphasizes the necessity of creating a street environment where parents and guardians feel at ease without the need for constant supervision to ensure their children’s safety. Newly developed technologies and methods for analyzing street characteristics are particularly suited to studying factors influencing children’s safety, health, and well-being. Relevant factors include sidewalks, crosswalks, traffic density, green spaces, and other environmental aspects that may affect children’s sense of safety, comfort, and spatial awareness in the streetscape.

Deep learning is a research tool inspired by the structure and function of the human brain. It enables computer vision technology to process large-scale street view images (SVIs) efficiently in a hierarchical manner, extracting features and making predictions (Trichês Lucchesi et al., 2023). Typically, a large number of annotated images are required for training. To extract street elements, the most commonly used semantic segmentation models are SegNet (Badrinarayanan et al., 2017; Song et al., 2023), DeepLab (Nagata et al., 2020), FCN-8 (Kim et al., 2021), and Pyramid Scene Parsing Network (PSPNet) (Koo et al., 2022). This study used PSPNet due to its high accuracy in image segmentation and target detection tasks, as well as its robust performance in ensuring accurate and reliable street feature extraction (Zhao et al., 2017).

Despite the trend toward incorporating user feedback in urban space design, most urban assessment tools are geared toward adults. Standard methods for predicting image labels include convolutional neural network methods, which significantly outperform traditional methods (Dubey et al., 2016). The fully convolutional neural network is particularly useful for identifying objects in images or segmenting street views, as it retains spatial information throughout the network. Support vector machines are another class of supervised learning models that perform well in classification and regression tasks (Naik et al., 2014; Ordonez & Berg, 2014; Porzi et al., 2015), particularly in processing high-dimensional space data such as city image features, which contain extensive visual information. Experiments in Ordonez & Berg’s (2014) work demonstrate that even when trained and tested across different cities, the support vector machine model maintains a high degree of accuracy in predicting human perceptual attributes (such as wealth, uniqueness, and safety) in urban environments, demonstrating its generalization and robustness. Yao et al. (2019) have proposed a deep learning-based human-machine adversarial framework that utilized a random forest-based module to investigate the relationship between street view elements and user scores.

Using these models, researchers have successfully predicted human perception indicators in SVIs, such as safety, liveliness, and attractiveness (Zhang et al., 2019). In addition, the linkages between street view elements and user ratings reveal how visual elements affect residents’ perception of streets (Yao et al., 2019; Zhang et al., 2018). However, previous studies have largely failed to consider the distinct requirements of children, such as low sight lines, the need for safe play spaces, and heightened sensitivity to traffic noise. Conducting surveys with children also poses unique challenges compared to those with adults. This research gap underscores the need for a dedicated approach to assessing child-friendly streets. For this study, we developed a new method, the “machine-simulated human scoring model,” to address the challenges of assessing child-specific urban environment perception. This method combines computer vision segmentation and deep learning techniques, using an iterative feedback mechanism to simulate the subjective perception of pedestrians in evaluating the spatial characteristics of streets.

To test the usability of the perceived score prediction model, we used the Sham Shui Po district in Hong Kong SAR, China, as the case study area. This district was chosen for its street block-based planning model (Hui, 2015) and mixed demographics, including a relatively large proportion of low-income residents and families with young children (Cheng, 2013).
Volume 7 Issue 1 (2025) 3 https://doi.org/10.36922/jcau.3578

