Optimizing Local Climate Zones through Clustering for Surface Urban Heat Island Analysis in Building Height-Scarce Cities: A Cape Town Case Study

Manyanya, Tshilidzi; Nethengwe, Nthaduleni Samuel; Verbist, Bruno; Somers, Ben

doi:10.3390/cli12090142

¹ Division of Forest, Nature and Landscape, Department of Earth & Environmental Sciences, KU Leuven, Celestijnenlaan 200E, 3001 Leuven, Belgium

² Geography and Geo-Information Sciences, Faculty of Science, Engineering and Agriculture, University of Venda, Thohoyandou 0950, South Africa

³ KU Leuven Urban Studies Institute, KU Leuven, Parkstraat 45-3609, 3000 Leuven, Belgium

* Author to whom correspondence should be addressed.

Climate 2024, 12(9), 142; https://doi.org/10.3390/cli12090142

Submission received: 2 July 2024 / Revised: 15 August 2024 / Accepted: 4 September 2024 / Published: 10 September 2024

Abstract

Studying air Urban Heat Islands (AUHI) in African cities is limited by building height data scarcity and sparse air temperature (T_air) networks, leading to classification confusion and gaps in T_air data. Satellite imagery used in surface UHI (SUHI) applications overcomes the gaps which befall AUHI, thus making it the primary focus of UHI studies in areas with limited T_air stations. Consequently, we used Landsat 30 m imagery to analyse SUHI patterns using Land Surface Temperature (LST) data. Local climate zones (LCZ) as a UHI study tool have been documented to not result in distinct thermal environments at the surface level per LCZ class. The goal in this study was thus to explore relationships between LCZs and LST patterns, aiming to create a building height (BH)-independent LCZ framework capable of creating distinct thermal environments to study SUHI in African cities where LiDAR data are scarce. Random forests (RF) classified LCZ in R, and the Single Channel Algorithm (SCA) extracted LST via the Google Earth Engine. Statistical analyses, including ANOVA and Tukey’s HSD, assessed thermal distinctiveness, using a 95% confidence interval and 1 °C threshold for practical significance. Semi-Automated Agglomerative Clustering (SAAC) and Automated Divisive Clustering (ADC) grouped LCZs into thermally distinct clusters based on physical characteristics and LST data internal patterns. Built LCZs (1–9) had higher mean LSTs; LCZ 8 reached 37.6 °C in Spring, with a smaller interquartile range (IQR) (34–36 °C) and standard deviation (SD) (1.85 °C), compared to natural classes (A–G) with LCZ 11 (A–B) at 14.9 °C/LST, 17–25 °C/IQR, and 4.2 °C SD. Compact LCZs (2, 3) and open LCZs (5, 6), as well as similar LCZs in composition and density, did not show distinct thermal environments even with building height included. The SAAC and ADC clustered the 14 LCZs into six thermally distinct clusters, with the smallest LST difference being 1.19 °C, above the 1 °C threshold. 
This clustering approach provides an optimal LCZ framework for SUHI studies, transferable to different urban areas without relying on BH, making it more suitable than the full LCZ typology, particularly for the African context. This clustered framework ensures a thermal distinction between clusters large enough to have practical significance, which is more useful in urban planning than statistical significance.

Keywords:

local climate zones; thermal environment; practical significance; hierarchical clustering algorithms

1. Introduction

The global population almost tripled from 2.5 billion in 1950 to 7.3 billion by 2015 and passed 8 billion in November 2022 [1,2]. Furthermore, the African continent is the second most populous in the world and is projected to be the most inhabited by 2100 [1,2]. With this global population rise, urban migration in Africa has grown from 27% in 1950 to 40% in 2015 and is projected to reach 60% by 2050 [3]. Yet, despite this continuous growth in African urban areas, less than 5% of global heat studies have focused on the African continent [4]. Natural population growth and urban migration result in the development of new infrastructure that is different from that of rural areas, which alters the urban climatic conditions at the mesoscale [5,6]. This alteration of the urban climate results in temperatures that are higher than those of the surrounding countryside, a phenomenon called an Urban Heat Island (UHI), which is exacerbated by increased urbanisation [7].

The UHI is divided into three main components, where the uppermost component is at the Urban Boundary Layer (UBL), created by the heat coming up from urban roofs and street canyons, among others [8]. The UHI at the UBL has become an important subject of study due to its influence on urban airflow and pollution dispersion [9]. This includes effects on thermal turbulence, atmospheric stability, convective structures, nocturnal inversions, and mixed layer depth, all studied through T_air dynamics [10]. The mid-UHI component is below roof level but above the surface and is measured at screen height (2 m), thus incorporating the activity of winds blowing through the canyons, internal combustion engine cars and radiation from urban life, as well as population congestion. Urban Heat Islands also have a surface component studied through land surface temperatures (LST). The definition of LST accepted by the International LST and Emissivity Working Group (ILSTE) is based on [11] and is defined as “a measure of how hot or cold the surface of the Earth would feel to the touch”. Despite LST being a major component of urban micro and macro climatology, much of the research in urban temperature studies has focused more on the atmosphere than the surface. This is primarily due to the dynamic nature of T_air, which is influenced by multiple parameters that are difficult to quantify and predict [12]. Though less dynamic, the surface component of the UHI is significant in that it regulates urban ecology and heat fluxes [13].

A challenge for UHI studies in the African context, particularly in South Africa, is that the network of T_air stations is sparse, with gaps of more than 50–100 km between official stations [14]. This becomes a limitation in studying air UHI (AUHI). However, LST data can be extracted from satellite imagery covering the entire city at spatial resolutions ranging from 10 m for Sentinel to 30 m for Landsat products. While it can be argued that the atmospheric component is the best indicator for overall UHI dynamics within a city, it is also known that atmospheric temperature is correlated with LST [15,16]. Surfaces heated by the sun warm the surrounding air, while cooling surfaces lead to a drop in T_air. LST tends to fluctuate more rapidly than air temperature, especially during the day when surfaces like asphalt heat up quickly. At night, these surfaces cool faster, resulting in lower LST than T_air [17]. Urban areas, with heat-retaining materials, often have higher LSTs, contributing to the Urban Heat Island (UHI) effect, where cities experience higher T_air than rural areas [18]. Weather conditions like cloud cover and wind can further affect this relationship by moderating or enhancing heat exchange between the surface and atmosphere [9].

Understanding LST is crucial for climate studies, as it plays a key role in local microclimates and the broader impacts of urbanisation and climate change, particularly in the context of surface heat islands. As such, in the absence of T_air data, the SUHI through LST becomes the way to understand UHI. An added advantage for studying SUHI over AUHI is that the level of detail that can be deduced from pixel based LST estimation is always greater than any available weather station network [19]. The atmosphere is warmed from the ground up and landscape-based UHI mitigation as well as global warming adaptation strategies aimed at bringing cooling to the atmosphere are implemented primarily on the ground [20]. As such, a study of SUHI provides the basis for an improved understanding of atmospheric heat dynamics within an urban area. Consequently, due to the limitations in weather station networks and the high availability of pixel scale LST data, the scope of this study was narrowed to the surface component of the UHI. This Surface Urban Heat Island (SUHI) is regulated by the ability of surface structures to absorb and reflect incoming solar radiation as well as emit long-wave radiation in the infrared region [21]. This depends on the colour of the surface feature material, ability to hold moisture, organic content, and proximity to shadow casting features, among others [22].

The most used methods of extracting LST from satellite imagery originate from applications of the inversion of Planck’s function, using the split window, or single channel algorithms [23]. While the methods of extraction are present, there is no universal method accepted by all scholars for quantifying the magnitude of the SUHI [24]. The two most used methods are Hot Spot Analysis, that calculates the Getis-Ord Gi* statistic as used for Cape Town by [25], which looks at a pixel and its neighbours to identify hot/cold spots, and the Basic Elimination (BE) method used by [26] that relies on the presence of similar classes in the surrounding rural area. Any attempt at quantifying the SUHI without reference material from the surrounding areas remains abstract and not universally acceptable as a correct representation. For this study, this challenge was not a factor because the interest was not in quantifying the magnitude of the SUHI in Cape Town, but to study and understand the spatial patterns and variability across the urban area in order to optimise the local climate zones-based methodologies for the best outcome specifically suitable for SUHI spatial analysis.

The local climate zones (LCZ) framework is universally accepted as the main method for studying these spatial variations that are attributed to the combined influence of the multiple complex physical parameters that regulate urban climatology. According to [27], the biggest factor regulating city scale SUHI spatial patterns is impervious surface area (ISA). In addition to ISA, the other important parameters that influence SUHI at the local scale are vegetation cover ratio, ground emissivity, as well as building density [28,29]. However, the local climate zones framework as a methodology specifically designed for studying UHI relies heavily on more parameters than those stated by [27,28]; the main one of which is building height, which distinguishes between low-, mid-, and high-rise members of the compact and open groups. The limitations experienced in African urban areas to readily provide height data for proper LCZ classification compromises the accuracy with which the built classes are discriminated, particularly from remote sensing-based methods. It is thus important to investigate the extent to which this height dependent framework can be applied to study heat islands in an area where height data are not readily available [30]. Previous studies on SUHI using the LCZ framework state that variables relating to surface cover are more important in determining surface heat distributions; however, these were still studies performed where the LCZ classification was performed with high-resolution building height data available [31]. As such, the classification without height data was not isolated and tested for its SUHI applications. In the African Context this is of particular importance to investigate how the lack of building height affects the capacity of the framework to assess SUHI. 
Furthermore, this also provides the basis for assessing height independent modifications that can be made to the framework so that it best explains the observed surface heat patterns without being compromised by missing building height data.

To assess the framework’s ability to explain the observed LST patterns, the study was split into the following objectives: (1) to map LCZs over Cape Town across all four seasons of the year 2020, with and without building height; (2) to compute and extract LST values for each LCZ map; (3) to analyse the spatial LST trends and distinctiveness of each LCZ’s surface thermal environment across all seasons; (4) to isolate the role of building height in the creation of LCZ surface thermal environments; and (5) to optimise the LCZ framework through clustering for distinct surface thermal environments independent of building height.

2. Materials and Methods

2.1. Materials

2.1.1. Study Area

The study focused on the Metropolitan City of Cape Town, in the Western Cape province of South Africa. Cape Town as shown in Figure 1 is at the southernmost tip of the African continent, with a Mediterranean type of climate [25].

The overall design of Cape Town as an urban area is neither regular nor symmetrical around the CBD because the ocean forms a barrier on its southern, southeastern, and southwestern sides. As such, the area progressing outward from the CBD on the ocean side is made up of primarily residential areas, while the landward sides are made up of other land uses. The presence of Table Mountain at the centre of Cape Town cuts off the natural development of the city, thus contributing to the lack of symmetry in Cape Town’s design, as is the case with Rio de Janeiro, Hong Kong, and Kabul [32,33,34]. The majority of Cape Town’s urban land use is residential, located further away from the CBD. There exist patches of other land-use zones, such as manufacturing and industrial, located at the Victoria and Alfred Waterfront harbour as well as the Airport region. While a basic function-based classification of Cape Town already exists, this study aims to arrive at a classification based on LCZs that is optimised for studying the complexities of SUHI.

2.1.2. Data

Landsat 8 Operational Land Imager (OLI)–Thermal Infrared Sensor (TIRS) images were used for the purposes of this study. As shown in Figure 2, Landsat 8 has 11 bands across a wavelength spectrum of 452–12,510 nm ranging from visible to thermal, with three additional bands compared to its predecessors; namely, coastal (Band 1) at 400 nm, the cirrus cloud band at 1360 nm, and dual thermal bands between 10,600 and 12,510 nm [35]. The sensor is also equipped with a 15 m spatial resolution panchromatic band (B8), with 30 m resolution for the rest of the multispectral bands. The temporal resolution is 16 days.

These images were taken for Summer (Nov–Feb), Autumn (Mar–May), Winter (Jun–Aug), and Spring (Sep–Oct) 2020 at 0–5% cloud cover, with one image selected from the middle of each season.

2.2. Method

2.2.1. LCZ Classification

A Random Forests (RF) classifier performed in R was chosen for the classification of Cape Town into LCZs. According to the LCZ framework, an urban area can be divided into 17 classes which have distinct thermal environments from each other at the local scale [37]. This framework divides the urban area into 10 built and 7 natural land cover classes as shown in Figure 3, based primarily on building density, land cover, and imperviousness [38]. Further discrimination between the classes happens based on building height (BH) to distinguish between low-, mid-, and high-rise, canyon width (W), canyon aspect ratio (BH/W), building surface fraction (BSF), impervious surface fraction (ISF), as well as pervious surface fraction (PSF) [39].

Within the LCZ framework, there exist two parent groups that are further subdivided by building height; these are the compact (high-, mid-, and low-rise) as well as the open (high-, mid-, and low-rise) groups [40]. The sampling of polygons for the RF was a desktop-based systematic sampling performed on Google Earth Pro, Google Street View, and QGIS based on parameters of BH, W, BSF, ISF, and PSF for each LCZ present in Cape Town. These polygons were circular with a radius of 100 m, which is the minimum size of a local climate zone patch [38]. These training and validation polygons were verified on site through visual observation, with height confirmed by counting the number of floors per building. This verification is particularly important for the mid- and high-rise LCZ polygons [40], because building height cannot be accurately estimated from the 2D desktop remote environment. The in situ verified polygons were randomly split and 70% were used for model training.

Four different data stacks were prepared for the RF classifier. The first data stack was only raw data (Landsat Images), while the second stack was raw data and spectral indices of Normalised Difference Vegetation Index (NDVI), Normalised Difference Built-up index (NDBI), and Enhanced Built-up and Bareness Index (EBBI). NDVI is calculated from the near-infrared and red bands of the satellite image product, and it is an indicator for vegetation presence and greenness [41]. The NDBI is calculated from the shortwave and near-infrared bands to discriminate built environments from natural land cover [42]. The index of EBBI is calculated from shortwave, near, and thermal infrared bands to map built and barren areas [43]. A third RF protocol was carried out by running a neighbourhood function over the second stack classification. A neighbourhood function is a moving kernel that evaluates a pixel relative to a predefined number of neighbours and aggregates them to a single class based on majority of occurrences [44]. This solves the salt and pepper effect that creates fragmentation of the LCZ classes in the output. A fraction (30%) of the initially ground-truth polygons was used to validate the classification. The validation process returned parameters of overall accuracy (OA), producer accuracy (PA), user accuracy (UA) and Cohen’s kappa value (Kappa), derived from a confusion matrix. Based on these outputs, secondary parameters were also calculated, those being the overall accuracy in urban areas, overall accuracy in natural classes, as well as the ratio of urban to natural.
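
As an illustration of the inputs to the second and third stacks, the sketch below computes the three indices for a single pixel and applies a majority (neighbourhood) filter to a small class grid. It is a minimal Python sketch rather than the R workflow used in the study. The function names, the clipping of windows at the grid edge, the tie-breaking rule, and the EBBI form (the commonly cited (SWIR − NIR) / (10·√(SWIR + TIR))) are assumptions, since the band scaling and kernel settings are not specified here.

```python
import math
from collections import Counter


def normalised_difference(a, b):
    """Return (a - b) / (a + b); 0.0 when the denominator is zero."""
    total = a + b
    return (a - b) / total if total else 0.0


def ndvi(nir, red):
    """Vegetation index from near-infrared and red reflectance."""
    return normalised_difference(nir, red)


def ndbi(swir, nir):
    """Built-up index from shortwave- and near-infrared reflectance."""
    return normalised_difference(swir, nir)


def ebbi(swir, nir, tir):
    """Built-up and bareness index (assumed form; needs swir + tir > 0)."""
    return (swir - nir) / (10.0 * math.sqrt(swir + tir))


def majority_filter(classes, size=3):
    """Replace each cell by the most frequent class in its size x size window.

    Windows are clipped at the grid edge; ties go to the class met first.
    """
    half = size // 2
    rows, cols = len(classes), len(classes[0])
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            window = [
                classes[i][j]
                for i in range(max(0, r - half), min(rows, r + half + 1))
                for j in range(max(0, c - half), min(cols, c + half + 1))
            ]
            row.append(Counter(window).most_common(1)[0][0])
        out.append(row)
    return out
```

Applied to a grid in which a single pixel of one class sits inside a patch of another, `majority_filter` reassigns the isolated pixel to the surrounding class, which is the salt-and-pepper suppression described above.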

Due to the absence of citywide LiDAR coverage, building height was not included in the classification stack initially used for applying the Random Forest classifiers to discriminate LCZs. As has been shown by previous studies, acceptable accuracies of LCZ classification can be achieved even without building height, which is scarce in the African context [45]. However, since this study goes beyond LCZ classification and highly depends on each class being accurately classified, it is imperative to assess the impact of incorporating building height into this framework before drawing conclusions about LST. Limited coverage (<10% of Cape Town) height data was obtained from the City of Cape Town (CoCT) municipality databases covering the southeastern side captured during the summer of 2020. Therefore, a fourth data stack was made by adding a building height layer to the second stack. This classification was then compared for accuracy with the initial one conducted without building height data.

From the limited LiDAR coverage, four square areas of 4 km by 4 km, each containing 16 LiDAR tiles, were sampled as shown in Figure 4. The selection of adjacent tiles within each block allows for classification of an integrated environment large enough to encompass multiple local climate zones. Using the 2 m LiDAR point cloud data, a height layer was generated by subtracting the ground elevation of the digital elevation model (DEM), created from ground-tagged points of the same dataset, from the digital surface model (DSM). This resulting raster layer was added to the classification stack along with spectral indices, necessitating the creation of a new training set for this classification.
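
The height layer can be sketched as a per-cell difference between surface and ground elevation. The snippet below is a minimal Python illustration under the assumption that both rasters are already co-registered on the same grid; the function name, the list-of-lists representation, and the clamping of negative differences to zero are illustrative choices, not the study's processing chain.

```python
def height_above_ground(dsm, dem):
    """Return surface elevation minus ground elevation for each cell.

    dsm and dem are equally sized 2-D lists of elevations in metres.
    Small negative differences (noise) are clamped to zero.
    """
    return [
        [max(surface - ground, 0.0) for surface, ground in zip(dsm_row, dem_row)]
        for dsm_row, dem_row in zip(dsm, dem)
    ]
```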

Due to the lack of accessible building height data in African urban areas, most LCZ classification studies are conducted without a building height layer. Without evaluating an LCZ protocol that includes building height, any conclusions regarding the thermal distinctiveness of LST in each LCZ will carry the uncertainty of whether the observed patterns are marred by propagated errors from imprecise classification due to the lack of building height or by other factors related to the framework itself. Therefore, including a building height stack, however limited in coverage, is essential to distinguish between implementation errors and framework errors.

2.2.2. LST Extraction

The retrieval of LST values from satellite imagery can be performed using various methods, depending on the number of channels in the thermal sensor on the spaceborne platform [46]. The commonly used methods fall under two main categories, namely the use of radiative transfer equations (RTEs) and the inversion of Planck’s function, both applied in single channel (SCA) or split-window (SWA) algorithms. The challenge with single channel methods using RTE is that they rely on atmospheric profiles, which must be estimated at the time of the satellite overpass and are not always readily available [15,47]. While Landsat sensor-based approaches do not require atmospheric profiles, Landsat 7 and its predecessors only have single-channel thermal sensors, which makes SC methods the most applicable, as opposed to Landsat 8, used in this study, which has two thermal channels and is thus suitable for SWA. However, a comparative study by [47], as well as previous studies by [46], found that Planck’s function in a single channel algorithm yielded the best LST approximation among the viable approaches for both Landsat 7 and Landsat 8, despite the latter having dual thermal bands. As such, for this study, a single channel inversion of Planck’s function was chosen for Landsat 8, using band 10 rather than band 11 due to its better calibration and higher resistance to noise [48].

This inversion of Planck’s function is based on a mathematical combination of compound parameters calculated from the individual bands of the Landsat image product and was performed on the Google Earth Engine (GEE). The GEE was chosen for its capacity to handle the downloading of raw data, pre-processing, processing, and extraction all in one interface without additional data. The input parameters include the Normalised Difference Vegetation Index (NDVI), Top of Atmosphere (TOA) radiance, brightness temperature (BT/Tsen), the emissivity of the land surface, vegetation, and soil, the vegetation fraction, and the cavity effect [38] (Figure 5; Table 1). The outcome of this algorithm is an estimate of the LST value for each 30 m-by-30 m Landsat pixel, producing a surface temperature map for each sampled image.
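
The chain from digital number to LST can be sketched per pixel as follows. This is a hedged Python illustration of a commonly used single-channel formulation, not a reproduction of the study's GEE script or of Table 1. The rescaling and thermal constants are the values typically reported in Landsat 8 Level-1 metadata for band 10 and should be read from the scene metadata in practice. The NDVI thresholds, emissivity values, constant cavity term, and effective wavelength are assumptions chosen only for illustration.

```python
import math

# Typical Landsat 8 band 10 metadata values (check the scene's MTL file).
ML, AL = 3.342e-4, 0.1          # radiance rescaling: multiplicative, additive
K1, K2 = 774.8853, 1321.0789    # thermal conversion constants
WAVELENGTH_UM = 10.895          # assumed effective wavelength of band 10
RHO_UM_K = 14388.0              # h*c/k_B expressed in micrometre kelvin


def toa_radiance(dn):
    """Convert a band 10 digital number to top-of-atmosphere radiance."""
    return ML * dn + AL


def brightness_temperature(radiance):
    """Invert Planck's function to obtain brightness temperature in kelvin."""
    return K2 / math.log(K1 / radiance + 1.0)


def vegetation_fraction(ndvi, ndvi_soil=0.2, ndvi_veg=0.5):
    """Squared, clamped NDVI ratio (threshold values are assumptions)."""
    ratio = (ndvi - ndvi_soil) / (ndvi_veg - ndvi_soil)
    return min(max(ratio, 0.0), 1.0) ** 2


def emissivity(pv, eps_veg=0.99, eps_soil=0.97, cavity=0.005):
    """Mix vegetation and soil emissivity, plus an assumed constant cavity term."""
    return min(eps_veg * pv + eps_soil * (1.0 - pv) + cavity, 1.0)


def land_surface_temperature(bt_kelvin, eps):
    """Emissivity-corrected surface temperature in kelvin."""
    return bt_kelvin / (1.0 + (WAVELENGTH_UM * bt_kelvin / RHO_UM_K) * math.log(eps))
```

Because the logarithm of an emissivity below one is negative, the corrected LST is always slightly higher than the brightness temperature; subtracting 273.15 converts the result to °C.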

2.2.3. Spatial Trends in LST and Thermal Variability within and between LCZ Classes

Systematic sampling was used to extract LST values within each LCZ. This was carried out with a point grid of 100 m inter-point distances, corresponding to the minimum size of an LCZ according to [38]. However, there is no control over the location of the pixel within each LCZ. As such, a 3-pixel buffer was created for each point, and only the points whose buffer lay completely within the same LCZ were selected, to minimise edge effects and ensure pixel independence, making the analysis robust. This also guarantees that each point is representative of its LCZ’s location, since it is at least 100 m within the LCZ on all sides.
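
The buffered grid sampling can be sketched on a classified raster as below. This minimal Python example works in pixel units and assumes a square buffer window; the function name, the `step` and `buffer` parameters, and the exclusion of points closer than one buffer to the raster edge are illustrative assumptions rather than the study's GIS procedure.

```python
def interior_samples(lcz, step, buffer):
    """Return (row, col, lcz_class) for grid points with a single-class buffer.

    lcz is a 2-D list of class labels; step is the grid spacing and buffer
    the half-width of the square window, both in pixels. Points closer than
    `buffer` pixels to the raster edge are skipped.
    """
    rows, cols = len(lcz), len(lcz[0])
    samples = []
    for r in range(buffer, rows - buffer, step):
        for c in range(buffer, cols - buffer, step):
            label = lcz[r][c]
            window = (
                lcz[i][j]
                for i in range(r - buffer, r + buffer + 1)
                for j in range(c - buffer, c + buffer + 1)
            )
            if all(value == label for value in window):
                samples.append((r, c, label))
    return samples
```

Points that fall near a boundary between two zones are dropped because their window contains more than one class, so only interior pixels contribute LST values.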

In addition to a visual analysis of the spatial pattern of LST distribution across each LCZ, bar graphs and boxplots were employed as diagnostic tools to investigate the LST data distribution between and within classes for each season. This allowed for the examination of seasonal patterns between and within LCZs, enabling the isolation of seasonal influences. The interquartile range (IQR) indicates where 50% of the data within each LCZ lie, with a smaller IQR suggesting greater homogeneity of surface material within the class. Conversely, a larger IQR indicates heterogeneity in configuration and makes it harder to predict values using machine learning predictors based solely on the mean and median. Another metric displayed by the box plots is the standard deviation, giving a reflection of the range of LST values within each individual LCZ. However, while visualising LST means and understanding their distribution is informative, it does not provide a statistical basis for concluding whether each LCZ can be considered a distinct thermal environment.
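
The per-class descriptive statistics behind such plots can be sketched as follows. This is a small Python illustration using only the standard library; the function name and the choice of the inclusive quartile method are assumptions, and any sample values passed to it are hypothetical rather than taken from the study.

```python
import statistics


def lst_summary(samples_by_lcz):
    """Return mean, IQR, and standard deviation of LST for each LCZ.

    samples_by_lcz maps an LCZ label to a list of LST values in °C
    (at least two values per class).
    """
    summary = {}
    for label, values in samples_by_lcz.items():
        q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
        summary[label] = {
            "mean": statistics.mean(values),
            "iqr": q3 - q1,
            "sd": statistics.stdev(values),
        }
    return summary
```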

2.2.4. Statistical and Practically Significant Thermal Environment Distinction between LCZs

A one-way Analysis of Variance (ANOVA) test was conducted at a 95% confidence level to explore the influence of LCZ on LST by examining mean disparities. While ANOVA offers insights into whether there are statistically significant differences among LCZ means, it does not identify specific significant pairs. To delve deeper, a post hoc Tukey’s Honest Significant Difference (HSD) test was implemented with a 95% confidence interval. Although the Tukey HSD test uses p-values to establish statistical significance, it is crucial to acknowledge that statistical significance may not always align with practical significance, as argued by [50,51]. Conventional statistical inference may fail to capture the intricacies of certain natural processes and dynamic systems, potentially overplaying their practical implications [52]. Hence, notwithstanding statistical significance, it is vital to consider practical significance, particularly when interpreting temperature differences within LCZs. This approach seeks to offer a nuanced comprehension of the relationship between LCZ and LST, considering both statistical and practical significance by using Tukey p-values and adopting a 1 °C minimum threshold. Previous studies have shown that a 1 °C change in LST can lead to substantial UHI intensity; it can also significantly impact energy consumption and cause thermal stress in vegetation [53,54,55]. This threshold was also suggested by [56] for making urban design plans based on the potential for significant climatological impact. They argued that landscape-based UHI mitigation strategies are built on practical implications and not statistical significance. For this study, we adopted this 1 °C threshold as the decider between practical significance and insignificance.
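
The two-step decision can be sketched as below: a one-way ANOVA F statistic over the LCZ groups, followed by a pairwise rule that combines a Tukey-adjusted p-value with the 1 °C threshold. This is a minimal Python illustration; the function names are hypothetical, the adjusted p-values are assumed to come from a statistics package, and treating a pair as distinct only when both criteria hold is one reading of the procedure rather than a confirmed implementation.

```python
from statistics import mean


def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups of LST samples."""
    values = [v for g in groups for v in g]
    grand_mean = mean(values)
    group_means = [mean(g) for g in groups]
    k, n = len(groups), len(values)
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, group_means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def classify_pair(mean_a, mean_b, p_adjusted, alpha=0.05, threshold_c=1.0):
    """Flag a pair of LCZs as statistically and/or practically different."""
    statistical = p_adjusted < alpha
    practical = abs(mean_a - mean_b) >= threshold_c
    return {
        "statistical": statistical,
        "practical": practical,
        "distinct": statistical and practical,
    }
```

With large samples, a mean difference of a few tenths of a degree can return a very small p-value; under this rule such a pair is statistically but not practically different, and is therefore not treated as a distinct thermal environment.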

Role of building height on LCZ thermal environment distinctiveness.

For the stack classified with building height, Tukey’s ANOVA-based HSD test, coupled with the temperature threshold of 1 °C and a confidence interval of 95%, was employed to ascertain whether each LCZ exhibits a unique LST thermal environment compared to the other LCZs. This aims to determine whether the enhanced classification and differentiation of height levels results in practically distinct thermal environments for these classes. If proven practically distinct across the board, it would indicate that building height significantly influences SUHI beyond statistical significance. This would overall highlight that any disparities in the lack of distinction in surface thermal environment are not due to the framework but propagated from the lack of building height. Conversely, if the improved classification fails to yield practically distinct thermal environments, it suggests that the increased classification accuracy resulting from building height inclusion does not differentiate thermal environments in classes solely differing in building height, thus pointing to the framework itself as being limited in creating thermally distinct LCZs within the open and compact groups.

2.2.5. Local Climate Zone Optimisation for Statistical and Practical Significance

Upon analysing the ANOVA and Tukey HSD results for the LCZ-LST pairs both with and without height, we undertook a detailed LCZ clustering analysis. This was conducted to create an LCZ clustering with the maximum number of LCZ clusters constituting individual thermal environments. This involved utilising both semi-automated and automated statistical clustering techniques. Semi-Automated Agglomerative Hierarchical Clustering (SAAC), using the hclust function in R, leveraged known physical LCZ features to classify individual classes into clusters based on their physical characteristics [57]. Meanwhile, the fully automated Divisive Hierarchical Clustering (ADC), implemented via the NbClust package in R, utilised temperature patterns to classify LCZs into clusters [58,59,60]. While the agglomerative SAAC yielded clusters at different levels based on provided features, the divisive ADC employed 26 built-in indices to estimate the optimal number of clusters before assigning points to their respective clusters using Ward’s method [61].
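
To make the merging logic concrete, the sketch below applies Ward's minimum-variance criterion to one mean LST value per LCZ and then reports the smallest gap between neighbouring cluster centroids, which can be compared with the 1 °C threshold. It is a simplified Python stand-in, not the hclust or NbClust workflow: it clusters on a single temperature variable, treats every LCZ as one observation, and uses hypothetical function names.

```python
def ward_clusters(lcz_means, k):
    """Agglomerate labelled 1-D values into k clusters using Ward's criterion.

    lcz_means maps an LCZ label to its mean LST. Returns a list of
    (members, centroid, size) tuples sorted by centroid.
    """
    clusters = [([label], value, 1) for label, value in lcz_means.items()]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                _, ci, ni = clusters[i]
                _, cj, nj = clusters[j]
                # Increase in within-cluster sum of squares if i and j merge.
                cost = ni * nj / (ni + nj) * (ci - cj) ** 2
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        mi, ci, ni = clusters[i]
        mj, cj, nj = clusters[j]
        merged = (mi + mj, (ci * ni + cj * nj) / (ni + nj), ni + nj)
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return sorted(clusters, key=lambda c: c[1])


def smallest_centroid_gap(clusters):
    """Smallest difference between neighbouring centroids (needs >= 2 clusters)."""
    centroids = [c[1] for c in clusters]
    return min(b - a for a, b in zip(centroids, centroids[1:]))
```

A candidate number of clusters can then be accepted only if `smallest_centroid_gap` is at least 1.0, mirroring the practical-significance check applied to the final clusters.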

The 26 indices for the fully automated divisive clustering algorithm can be categorised into several subgroups based on the data pattern criteria they evaluate. These subgroups include indices based on cluster compactness and separation, which assess how tightly grouped the data points are within clusters and how well-separated the clusters are from each other [62,63]. Another subgroup includes indices based on within-cluster variance, which uses it to evaluate their compactness [61,64]. There are also indices that specifically address cluster separation, focusing on the distinctness of the clusters [65]. Several indices use specific statistical criteria to evaluate clustering performance [66]. Indices based on point-biserial correlation and similar methods evaluate the correlation between cluster membership and distances within the data [63]. Additionally, pseudo-statistical methods are utilised by some indices for clustering validation. Lastly, there are specific indices used for clustering evaluation purposes. This categorisation aids in understanding the different criteria used by each index and in selecting the appropriate ones for specific aspects of clustering evaluation. The resulting clusters were evaluated for distinctiveness using Tukey HSD at a 95% confidence interval and a 1 °C threshold for practical significance.
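
As an example of a compactness-and-separation index, the sketch below computes the mean silhouette width for a candidate partition of one-dimensional LST values. The silhouette is one widely used index of this family; the code is a plain Python illustration with a hypothetical function name and does not reproduce the NbClust implementation or its voting across indices.

```python
def mean_silhouette(clusters):
    """Mean silhouette width for a partition given as lists of 1-D values.

    Requires at least two clusters. Points in singleton clusters score 0.
    """
    scores = []
    for idx, own in enumerate(clusters):
        others = [c for j, c in enumerate(clusters) if j != idx]
        for p, x in enumerate(own):
            if len(own) == 1:
                scores.append(0.0)
                continue
            # a: mean distance to the other members of the same cluster.
            a = sum(abs(x - y) for q, y in enumerate(own) if q != p) / (len(own) - 1)
            # b: mean distance to the nearest other cluster.
            b = min(sum(abs(x - y) for y in c) / len(c) for c in others)
            denom = max(a, b)
            scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

Values close to 1 indicate tight, well-separated clusters, while values near or below 0 indicate that points sit closer to a neighbouring cluster than to their own, so the partition with the highest mean silhouette is preferred under this index.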

The observation in local climate zone applications, particularly in the African context, has been that while the general appearance of an LCZ is given in the framework, the local urban form introduces some variations in that universal appearance. In a previous study of LCZ adaptation, [45] discovered that there were differences in how the same LCZ appears across different urban areas. While this difference is not large enough to push the LCZ into an altogether different class, it is enough to shift physical characteristics of density, composition and, to some extent, configuration. As such, the choice of a semi-automated clustering that allows these nuances to be incorporated into the optimising process for each urban area makes this approach transferable. Meanwhile, adding a fully automated clustering algorithm that works solely on data structures makes this approach robust, in that it is based not only on the physical characteristics of each LCZ but also on the internal data structures in the LST.

3. Results

3.1. LCZ Classification

In line with the findings presented in our previous publication [45], where we conducted an extensive analysis of LCZ classification with Cape Town as one of the study areas, the current study extends this groundwork by employing the same Random Forest methodology. Despite using Landsat 8 for this current study as opposed to Sentinel 2A, we arrived at similar conclusions, particularly regarding the spring season being the most accurately classified across all seasons. Overall accuracy (OA) values are consistently higher when using spectral indices (SI) compared to using only raw Landsat 8 imagery, across all seasons. Specifically, in summer, OA increased from 32% (Raw) to 41% (SI), and further to 59% when including spectral indices and a 5-pixel neighbourhood function (SI-NH), as shown in Table 2. This trend is consistent across autumn, winter, and spring, with SI-NH consistently yielding the highest OA values. The increase in accuracy with the inclusion of spectral indices and spatial context (SI-NH) indicates the importance of both spectral diversity and spatial information in improving classification performance.

The OA for urban classes (OA-U) and for natural LCZs (OA-N) also reflect the superiority of SI-NH over other stacks. In spring, OA-U improved from 24% (Raw) to 42% (SI) and reached 51% with SI-NH, while OA-N saw an impressive jump from 68% (Raw) to 78% (SI), and up to 91% with SI-NH. This pattern holds true across other seasons, demonstrating that the inclusion of spectral indices and spatial context enhances both the accuracy of urban and natural LCZs. Kappa statistics, which measure agreement beyond chance, also follow this trend, showing the highest values with SI-NH across all seasons, highlighting the robustness of this method for LCZ classification.
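As an illustration of how the two headline metrics above are derived from a confusion matrix, the following is a minimal sketch in Python. It is not the R workflow used in this study; the function names and the three-class `toy_cm` matrix are hypothetical and serve only to show the arithmetic behind OA and the Kappa statistic.

```python
def overall_accuracy(cm):
    """Share of correctly classified samples.

    cm[i][j] is the number of reference samples of class i mapped as class j.
    """
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def cohens_kappa(cm):
    """Agreement beyond chance: (p_o - p_e) / (1 - p_e)."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    p_observed = overall_accuracy(cm)
    row_totals = [sum(cm[i]) for i in range(n)]
    col_totals = [sum(cm[i][j] for i in range(n)) for j in range(n)]
    # Chance agreement expected from the marginal totals alone.
    p_expected = sum(row_totals[k] * col_totals[k] for k in range(n)) / total ** 2
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical 3-class confusion matrix (illustrative counts, not study data).
toy_cm = [
    [40, 8, 2],
    [10, 30, 10],
    [0, 5, 45],
]

if __name__ == "__main__":
    print(f"OA    = {overall_accuracy(toy_cm):.3f}")
    print(f"Kappa = {cohens_kappa(toy_cm):.3f}")
```

Because Kappa subtracts the agreement expected by chance, it is always lower than OA for an imperfect classification, which is why both are usually reported together.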

Analysing the results seasonally, the classification performance varies, with spring showing the highest overall accuracy across all methods, particularly with SI-NH achieving 61% OA, compared to 59% in summer, 48% in autumn, and 44% in winter. This seasonal variation can be attributed to differences in vegetation phenology, atmospheric conditions, and landscape visibility throughout the year, which affect the spectral signatures captured by Landsat 8. The lower accuracy in winter could be attributed to more cloud cover and less distinct spectral differences between land cover types. The substantial improvement in accuracy when incorporating spectral indices and spatial information (SI-NH) across all seasons underscores the critical role these factors play in mitigating seasonal variability and enhancing the precision of land cover classification in urban environments like Cape Town. The consistently high OA-N values, particularly in spring and summer, highlight the model’s proficiency in distinguishing natural LCZ, while the improvements in OA-U across seasons point to better urban classification with enhanced spectral and spatial data. The seasonal and methodological variations in the classification results emphasise the importance of using advanced techniques to achieve higher accuracy in LCZ mapping. These findings suggest that for accurate urban and natural LCZ classification in Cape Town, integrating spectral indices and spatial information is essential.

Across all classified images, visual patterns arise which agree with the corresponding LST image mapped from the same Landsat product, as can be seen in the spring images (Figure 6). The Table Mountain area is heavily dominated by LCZ 11 (A–B/dense vegetation) and LCZ 13 (C/shrubs). These are heavily and mixed vegetated classes as would be expected in an area like Table Mountain. The harbour area around the V&A waterfront is dominated by large open (LCZ 8) and compact mid-rise (LCZ 2); this is expected for an area that is composed mostly of warehouse structures and city residentials. The agricultural lands of Philippi show a domination of LCZ 9 (sparsely built), expected for an area where houses are far apart. Khayelitsha and Gugulethu areas are dominated by LCZ 7 (lightweight) which is typical of the slums. The forested Table Mountain area (LCZ A and B) shows low LSTs, while the waterfront (Table Bay Harbor, LCZ 10/industrial) as well as the Epping industrial (LCZ 10) regions are on the higher end of the LST spectrum. The LCZ 8 (large open) warehouses at the city centre as well as the shopping complexes around the Tokai and Westgate regions also show high surface temperatures, like those which can be observed in the highly clustered residential zones such as LCZ 3 (compact low-rise). The area with the lowest LST appears to be the inland water bodies as well as the surrounding ocean. This is, however, just a visual observation-based assessment of spatial LST patterns; a statistical process is still needed to make significant and viable assessments.

3.2. Spatial Trends in LST and Thermal Variability within and between LCZ Classes

The point grid extraction process yielded a total of over 96,000 pixels across all 14 LCZs in Cape Town, highlighting variations in spatial coverage among different zones. LCZs 9 (sparsely built), 13 (shrublands), 14 (low plants), and 3 (compact low-rise) emerged as the most prominent classes, covering larger areas with a higher number of pixels. In contrast, LCZ 1 (compact high-rise) exhibited the smallest coverage, comprising only 126 pixels across the entire set of points. LCZ 9 stood out as the most extensive class, encompassing 15,479 pixels. The disparities introduced by differences in class size were minimised by using averages instead of individual values to ensure robust analysis.

The results from the bar graphs used to visualise the LST means revealed similarities across certain LCZs, despite their belonging to different classes (Figure 7). Notably, LCZ 8 (large open) consistently exhibited the highest mean LST values across all seasons, with its highest mean being at 37.2 °C in the summer, followed closely by LCZ 7 (lightweight) and LCZ 2 and 3 (compact mid- and low-rise, respectively). During the warmer seasons of spring and summer, LCZ 3 is always warmer than LCZ 2; however, the difference between their means is extremely small across all seasons. LCZ 2 has higher buildings and is thus more shaded. As such, while the two LCZs are comparable in density and composition, the shaded patches of LCZ 2 would result in overall lower LSTs. While the difference between LCZ 2 and 3 is small, they are both significantly higher in mean LST than LCZ 1, with a difference of 4 °C in the summer and 5 °C in the spring. These are all LCZs within the compact parent cluster, separated only by building height. While it can be argued that the low LST experienced in LCZ 1 (compact high-rise) is due to shading by tall buildings, the difference of 4–5 °C is larger than the 1–2 °C found in a previous study which had more LCZ 1 representation and thus better classification accuracy [67]. In this study, LCZ 1 (compact high-rise) was the least represented and least well classified LCZ, suggesting that the large observed differences could be bias propagated from the sampling. Throughout the seasons, LCZs characterised by homogeneous natural features, such as LCZ 17 (water) and vegetated LCZ 11 (A and B), have the lowest temperatures. However, the temperature gap between the hottest LCZ (large open/8) and the coolest (LCZ 17) is wider in the spring and summer (14 °C) than it is in the autumn and winter (4 °C).

The bar graphs show consistent structure in the mean LST rankings across the seasons, thus indicating non-randomness, suggesting that the inter LCZ-LST patterns remain the same regardless of the seasonal transitions in the regional and local climate.

An intra-LCZ diagnosis of the LST values using boxplots reveals that the classes that show the highest mean LSTs have smaller interquartile ranges (IQR), which means that the values are concentrated around the mean with small standard deviation (SD), indicating that the mean is representative of the entire class population. In the Spring, LCZ 7 has an SD of 1.85, and 50% of its points are located between 34 °C and 36 °C, while LCZ 11, which is on the cooler side, has a SD of 4.2 with an interquartile range of 17 °C to 25 °C (Figure 8). The open (LCZ 5 and 6) and sparsely built (LCZ 9) classes have wider IQRs and larger standard deviations, as indicated by the box whiskers. This is indicative of heterogeneity in the physical properties that make up the surface features in these classes. The natural LCZs have the widest IQR and largest standard deviations.
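The spread statistics discussed above (mean, SD, and IQR per class) can be computed as in the minimal Python sketch below. The function name `spread_summary` and the sample values are hypothetical, and the choice of the "inclusive" quartile method is an assumption; other software, including R's default quantile type, may interpolate quartiles slightly differently.

```python
import statistics


def spread_summary(lst_values):
    """Summarise the LST values (in deg C) sampled within one LCZ class."""
    # Quartiles with linear interpolation between order statistics (assumed method).
    q1, median, q3 = statistics.quantiles(lst_values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(lst_values),
        "sd": statistics.stdev(lst_values),  # sample standard deviation
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": q3 - q1,
    }


# Hypothetical LST samples for a single class (not study data).
toy_lst = [34.0, 35.0, 36.0, 37.0, 38.0]

if __name__ == "__main__":
    for name, value in spread_summary(toy_lst).items():
        print(f"{name:>6}: {value:.2f}")
```

A narrow IQR together with a small SD is what indicates that the class mean is representative of the class as a whole.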

The widest IQR is that of LCZ 1, ranging from as low as 3 °C in the winter and autumn months to 16 °C in the spring and summer months. While 3 °C seems like a reasonable IQR, comparing it with its fellow cluster members, it is more than twice their IQR, while in the warmer months it is more than five times their IQR. This could be indicative of spatial variety in features, but LCZ 1 was also the least well classified spectrally due to it only occupying a small fraction of Cape Town. Compact high-rise (LCZ 1) will thus be regarded as an anomaly and removed from further analyses of temperature to avoid its bias being propagated. The boxplots reveal also that there is more dynamic variation in the natural classes during all seasons while the built LCZs seem to not vary greatly in LST during the winter and autumn months. While this provides information regarding the LST dynamics within each LCZ it does not show whether each LCZ can be regarded as having its own statistically significant unique thermal environment.

3.3. Statistical and Practically Significant Thermal Environment Distinction between LCZs

The spatial patterns observed from the bar graphs and boxplots across multiple images indicate non-randomness in that they are repeated through the seasons, though in varying intensities. This is particularly observable with LCZ 8 (large open) constantly being the hottest and LCZ 17 (G/water) constantly being the coolest, as well as most of the others, except LCZ 16 (F/bare–paved), constantly maintaining their relative positions through the seasons. To some extent, we may even be able to observe relationships between certain LCZs such as LCZ 8 and 2 (compact mid-rise), LCZ 7 (lightweight) and 10 (industrial), LCZ 3 (compact low-rise) and 2, LCZ 5 (open mid-rise) and 6 (open low-rise) or 9 (sparsely built) and 14 (D/low plants). However, these inter- and intra-LCZ-LST patterns do not statistically explain the tendency of certain LCZ pairs to overlap and display what seem to be similar thermal environments. Moreover, they do not elucidate the role of building height in shaping thermal environments in a methodology that rests on building height as a basic parameter for within-cluster LCZ differentiation. This is even more crucial because the box and bar plots reveal that LCZ 3 and 2 and LCZ 6 and 5 have minimal differences in mean LSTs, and these are only separated by building height. Performing an ANOVA at a confidence level of 95% revealed that, overall, the LCZs are significantly different, with p-values of less than 0.05 for all seasons (Figure 9). We thus rejected the null hypothesis for all seasons.

The mean of the sum of squares (MS) indicates that between-group differences are highest in spring (40,358.51) and lowest in winter (4883.25) (Table 3). Additionally, the overall high MS values in the warmer periods suggest that temperature variations between local climate zones are greatest during these periods, confirming what the bar graphs already revealed. This directs studies on SUHI effects to prioritise warmer periods, as these are the seasons in which LCZs play a major role in LST patterns. While the ANOVA shows that the surface temperature variations between LCZs are statistically significant overall, it does not give details about which pairs are significantly different and which ones are not.
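To make the MS values concrete, the following minimal Python sketch shows how the between-group and within-group mean squares and the F statistic of a one-way ANOVA are obtained. The function name and the toy groups are hypothetical; the p-value, which requires the F distribution, is deliberately left out of this sketch.

```python
def one_way_anova(groups):
    """Return (MS_between, MS_within, F) for a dict {group_name: [values]}."""
    all_values = [v for values in groups.values() for v in values]
    n_total = len(all_values)
    n_groups = len(groups)
    grand_mean = sum(all_values) / n_total

    ss_between = 0.0
    ss_within = 0.0
    for values in groups.values():
        group_mean = sum(values) / len(values)
        # Variation of the group mean around the grand mean, weighted by size.
        ss_between += len(values) * (group_mean - grand_mean) ** 2
        # Variation of individual values around their own group mean.
        ss_within += sum((v - group_mean) ** 2 for v in values)

    ms_between = ss_between / (n_groups - 1)
    ms_within = ss_within / (n_total - n_groups)
    return ms_between, ms_within, ms_between / ms_within


# Hypothetical LST samples for three classes (not study data).
toy_groups = {
    "class_a": [1.0, 2.0, 3.0],
    "class_b": [4.0, 5.0, 6.0],
    "class_c": [7.0, 8.0, 9.0],
}

if __name__ == "__main__":
    ms_b, ms_w, f_stat = one_way_anova(toy_groups)
    print(f"MS between = {ms_b:.2f}, MS within = {ms_w:.2f}, F = {f_stat:.2f}")
```

A larger between-group MS relative to the within-group MS means that class membership explains more of the LST variation, which is the pattern reported for the warmer seasons.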

Tukey’s HSD post hoc analysis performed on the dataset revealed that while the ANOVA overall proves statistical significance across the entire dataset, some LCZ pairs, denoted by a p-value of more than 0.05 for a 95% confidence interval, show differences that are not statistically significant. Across all seasons, the observed differences range from a minimum of 0.09 °C between LCZ 9 (sparsely built) and 14 (low plants) in the spring to a substantial 13 °C between LCZ 17 (water) and 2 (compact mid-rise). While statistical significance denotes non-random differences, it does not automatically imply practical significance [50]. Statistical significance should always be interpreted according to the nuances of the type of research and field standards in order to translate to clinical/practical significance [68]. Adopting a practical effectiveness threshold of 1 °C allows for conclusions beyond statistical significance. LCZs within the same parent cluster (open/compact) characterised by similar building density and configuration exhibit mean LST differences below this threshold, suggesting negligible practical significance. For instance, the 0.2 °C difference between compact LCZs 3 (compact low-rise) and 2, though statistically significant, falls short of the practical threshold. Conversely, open–compact LCZ pairs display larger mean LST differences, exceeding those within the same parent cluster (open/compact) that are only differentiated by building height. Notably, the 1.08 °C mean LST difference between LCZ 6 (an open cluster) and LCZ 3 (a compact cluster), both low-rise, is more than five times that of LCZ 6 and 5 (open mid-rise) and that of LCZ 3 and 2, thus underscoring the influence of similarities in physical characteristics over building height.
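The two-step reading of the Tukey output described above (statistical significance first, then the 1 °C practical threshold) can be expressed as a small decision rule. The Python sketch below is illustrative only: the function name is hypothetical, and the p-values in the usage example are placeholders rather than values from this study; only the 0.2 °C and 1.08 °C mean differences are taken from the text.

```python
def classify_pair(mean_diff_c, p_value, alpha=0.05, practical_threshold_c=1.0):
    """Label one LCZ pair from its mean LST difference (deg C) and Tukey p-value."""
    if p_value > alpha:
        return "not statistically significant"
    if abs(mean_diff_c) < practical_threshold_c:
        return "statistically but not practically significant"
    return "statistically and practically significant"


if __name__ == "__main__":
    # Mean differences from the text; p-values are hypothetical placeholders.
    print("LCZ 3 vs 2:", classify_pair(0.2, p_value=0.01))
    print("LCZ 6 vs 3:", classify_pair(1.08, p_value=0.01))
```

Using the absolute difference makes the rule independent of the order in which the two classes of a pair are listed, since Tukey outputs report signed differences.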

Distinct natural LCZs, such as LCZ 17 (G/water), are clearly differentiated from all the other LCZs, with differences ranging from 2 °C when paired with LCZ 5 (open mid-rise) in the winter, which is less than one-fifth of the difference for the same pair in the spring (11 °C). The densely vegetated LCZ 11 (A–B) is also quite well differentiated across all seasons, with the smallest difference in the spring being 3 °C when compared with LCZ 5 and the higher end being 10 °C when compared with large open (LCZ 8). All the pairs that show differences that are not statistically significant (red, Figure 9) or those that might be statistically significant but not practically significant (blue, Figure 9) can be categorised into three groups: (i) those within the same parent cluster (open/compact) separated only by building height (e.g., LCZ 6 and 5 with a difference of 0.18 °C in the spring and 0.29 °C in the summer), (ii) those in different clusters but with shared similarities in feature density, composition, or configuration (e.g., LCZ 8 and 3 in the spring with a difference of 0.05 °C, LCZ 7 (lightweight) and 10 (industrial) in the winter and summer with differences of −0.73 °C and −0.93 °C, respectively, LCZ 7 and 6 in the spring, with a difference of 0.43 °C, LCZ 9 and 6, with a difference of −0.34 °C, and LCZ 9 and 14), and (iii) those with patterns that defy explanation without further investigation into other possible LST drivers, such as LCZ 16, whose thermal environment seems to not be significantly different from most other LCZs. This is most visible in the summer Tukey plot, where the LST differences between LCZ 16 and 8 and between LCZ 16 and 14 are both not statistically significant, while the differences between LCZ 16 and LCZs 10 and 3 are statistically significant but not practically significant (Figure 9).
Some of the pairs whose differences in thermal environments are neither statistically nor practically significant are reflected in the LCZ classification confusion matrices (Figure 10). This implies a similarity in spectral properties between these pairs.

Notably, the absence of statistically and practically significant mean differences between LCZ 8 (large open) and 3 (compact low-rise) is striking. Both LCZs are characterised by low-rise structures and mostly impervious surfaces with little to no vegetation. This similarity suggests a potential misclassification due to shared composition and configuration, also supported by the confusion matrix (Figure 10). Furthermore, the confusion of LCZ 10 (industrial) with LCZ 8 and 3 raises additional questions about classification typology. These findings suggest that shared composition and configuration may indeed lead to analogous thermal environments, warranting further examination into the classification process and its potential limitations. While the lack of distinct, practically significant thermal differences between LCZ 8 and 3 and between LCZ 7 (lightweight) and 6 (open low-rise) can be expected due to shared physical properties, the lack of significant differences between LCZ pairs 14 and 2 (−0.14 °C, winter) and 13 and 3 (−0.5 °C, winter) seems to defy convention. LCZs 14 and 13 are both purely vegetated, while LCZ 3 and 2 (compact mid-rise) are highly compact. These patterns also do not recur through the seasons and can thus be considered erroneous. Additionally, the colder months show similarities in thermal environment between LCZ 16 and 8, 13 and 10, 11 and 7, and 11 and 5. These are local climate zones that are completely different. However, what the autumn and winter reveal is that the surface temperature range during these months is very small compared to the spring and summer seasons. This suggests that even LCZs that share very few physical characteristics would have comparable thermal environments during these seasons, further implying that the LCZ typology would not have much of an effect in creating distinct surface thermal environments during the cooler seasons.

Across all the seasons, land pixels belonging to LCZ 2 (compact mid-rise) are classified as LCZ 3, and those belonging to LCZ 5 (open mid-rise) are classified as LCZ 6 (Figure 10). LCZ 3 and 2 are identical in composition and density and are only differentiated by building height. The same pattern is observed in the Tukey HSD output, where the LST difference between LCZ 3 and 2 falls below the practical significance threshold at 0.2 °C in the spring and 0.8 °C in the winter. Similarly, LCZ 6 and 5 exhibit the same patterns, with a difference of 0.18 °C in the spring, which is not even statistically significant at the 0.05 level, and 0.29 °C in the summer. The nuances of intra-cluster variation without a practical threshold value would indicate that each of these LCZs has its own distinct thermal environment. However, applying a threshold for practical significance merges the members of each parent cluster into a single distinct thermal environment. This, however, raises the question of whether the observed lack of differences in LST means is due to the two local climate zones constituting the same thermal environment or whether it is a classification error propagated into the LST analysis due to the lack of building height in the classification stack.

Building Height and Within-Cluster Thermal Environments

The spring image LCZ classification with height yielded a Kappa coefficient that was a 9% improvement on the classification without height. The producer’s accuracy (PA) over the high classes (LCZ 2 and LCZ 5) is considerably higher with height than without. According to this PA, there is a 60% probability that any spot chosen at random belonging to LCZ 2 on land is correctly classified on the heighted map as compared to the 36% of the un-heighted map, and similarly 50% versus 30% for LCZ 5 (Figure 11). The inclusion of height in the data stack does not only result in a better overall classification of the mid-rise classes but also less confusion with the low-rise classes.
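For readers less familiar with per-class accuracy measures, the minimal Python sketch below shows how producer's accuracy (and, for contrast, user's accuracy) is read off a confusion matrix. The function names, class labels, and counts are hypothetical and are not taken from Figure 11; rows are assumed to hold the reference classes and columns the mapped classes.

```python
def producers_accuracy(cm, labels):
    """PA per class: correctly mapped reference samples / all reference samples.

    cm[i][j] is the number of reference samples of class i mapped as class j.
    """
    return {labels[i]: cm[i][i] / sum(cm[i]) for i in range(len(labels))}


def users_accuracy(cm, labels):
    """UA per class: correctly mapped samples / all samples mapped to that class."""
    result = {}
    for j, label in enumerate(labels):
        mapped_total = sum(row[j] for row in cm)
        result[label] = cm[j][j] / mapped_total
    return result


# Hypothetical two-class example (illustrative counts, not study data).
toy_labels = ["mid_rise", "low_rise"]
toy_cm = [
    [18, 12],  # reference mid-rise: 18 mapped correctly, 12 mapped as low-rise
    [5, 25],   # reference low-rise: 5 mapped as mid-rise, 25 mapped correctly
]

if __name__ == "__main__":
    print("PA:", producers_accuracy(toy_cm, toy_labels))
    print("UA:", users_accuracy(toy_cm, toy_labels))
```

Producer's accuracy answers the question posed in the text (how likely a location that truly belongs to a class is to be mapped as that class), whereas user's accuracy describes how trustworthy a mapped label is.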

Another round of Tukey’s HSD based on the heighted Spring 2020 image revealed that the difference between LCZ 3 and 2 has increased from 0.2 °C to 0.65 °C, which means the improved classification has improved their distinction in thermal environment; however, it is still not a practically significant difference (Figure 12). The individual LCZs within the open cluster (LCZ 5 and 6) also have an increased LST mean difference, from 0.18 °C to 0.84 °C, which is still not practically significant, as both values fall below the 1 °C threshold. This suggests that while building height improves the surface thermal environment distinction of LCZs within each parent cluster, it is not enough to make them practically distinct. Moreover, the confusions that are observed with the un-heighted Tukey HSD between LCZs that share physical characteristics and those within each of the two building-height-dependent parent clusters are also observed in the heighted Tukey HSD. Neither LCZ 3 nor LCZ 2 is thermally distinct from LCZ 8, with which they share imperviousness, with LST differences of 0.08 °C and −0.56 °C, respectively. Similarly, LCZ 5 is still not thermally distinct from LCZ 7 at −0.05 °C. This suggests that the creation of distinct surface thermal environments rests on parameters that are more applicable across the entire LCZ spectrum than building height, which is restricted to the two parent clusters. Alongside these expected results, there are also unexpected ones, such as the confusion between LCZ 13 and 2 and between LCZ 13 and 5, at 0.80 °C and −0.86 °C, respectively. These are classes that do not have physical characteristics in common.

The fact that most of these LST patterns can be explained either through field observations or the known thermal properties of certain materials indicates that the LCZ framework is still a useful approach to UHI studies. However, it also shows that the typology, as it is, has too many physical property overlaps, which result in the LCZs not being thermally distinct from each other even during summer and spring periods when there is a wide range of temperatures on the surface. This suggests that the framework needs to be reimagined and optimised for applicability in the study of LST patterns to inform planning processes.

3.4. Local Climate Zone Optimisation for Statistically and Practically Distinct Thermal Environments

Having shown that the improvement in the classification that results from the inclusion of building height in the classification protocol does not solve the problem of lack of distinct thermal environments between groups that are within the same parent cluster (compact/open), we employed a semi-automated, as well as a fully automated, statistical clustering method to cluster the LCZs into groups. The semi-automated method clustered LCZs based on similarity between physical characteristics of composition (vegetation/impervious/water/material) according to a density scale of “none”, “low”, “medium”, and “high”. The automated statistical method used a total of 26 indices to determine the optimal number of clusters based on natural patterns in the LST data and assigned each observation point to a cluster.

3.4.1. Semi-Automated Agglomerative Hierarchical Clustering (SAAC)

Tukey HSD results revealed similarities in LCZ surface thermal environments based on imperviousness (LCZ 8 and 3, LCZ 8 and 2, LCZ 7 and 10), vegetation cover (LCZ 11 and 13, LCZ 9 and 14) as well as properties of water presence and construction material. Based on these properties, a matrix was created and imported into the algorithm in R to cluster the LCZs using decision trees at different levels to create a dendrogram (Figure 13). At level 3, the algorithm clustered the open LCZs (5 and 6) together with LCZ 16 (E and F) which were observed to have similar thermal environments.

The second level of the dendrogram pairs together LCZs which were also observed to have similar thermal properties from the Tukey HSD. The algorithm separated the industrial looking LCZs (7 and 10) into their own cluster and separated them from the other mostly impervious LCZs (3 and 2). The highly vegetated (LCZ 11 and 13) LCZs were grouped together, and the sparsely vegetated (LCZ 9 and 14) were grouped together. However, LCZ 17 (G/water) was taken as its own cluster at level one. Local climate zone 16 was also taken to be its own cluster at level one. However, due to the repeated confusion of LCZ 16 observed both spectrally and thermally with the open clusters (LCZ 5 and 6), we interpreted LCZ 16 at level three where the algorithm combined it with the open cluster. Notably, the agglomerative algorithm at the top of the dendrogram has one cluster made of the built LCZ classes while the other cluster is made of the natural classes and LCZ 9. During the ground truthing work, LCZ 9 (sparsely built) and 14 (low plants) were most observed to be quite similar in appearance. What this suggests is that the semi-automated process captures the nuances of physical similarities observed in the field and has an advantage in the ability to be interpreted at different levels with relevance to the objective of the study.
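The general idea of agglomerating zones from a matrix of ordinal physical scores can be sketched as follows. This Python example is an assumption-laden illustration and not the R procedure used in this study: the zone names, the scores, the Manhattan distance, and the average-linkage rule are all hypothetical choices made only to show how a "none" to "high" density scale can drive a bottom-up grouping.

```python
# Ordinal encoding of the density scale described in the text.
SCALE = {"none": 0, "low": 1, "medium": 2, "high": 3}


def encode(profile):
    """Turn a tuple of scale labels into numeric scores."""
    return [SCALE[level] for level in profile]


def manhattan(a, b):
    """Sum of absolute score differences (assumed distance measure)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def agglomerate(profiles, n_clusters):
    """Average-linkage agglomerative clustering down to n_clusters groups."""
    points = {name: encode(profile) for name, profile in profiles.items()}
    clusters = [[name] for name in points]
    while len(clusters) > n_clusters:
        best = None  # (distance, index_i, index_j) of the closest pair so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                pair_distances = [
                    manhattan(points[a], points[b])
                    for a in clusters[i]
                    for b in clusters[j]
                ]
                distance = sum(pair_distances) / len(pair_distances)
                if best is None or distance < best[0]:
                    best = (distance, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return [sorted(cluster) for cluster in clusters]


# Hypothetical zones scored as (vegetation, impervious, water); not study data.
toy_profiles = {
    "compact_a": ("low", "high", "none"),
    "compact_b": ("low", "high", "none"),
    "veg_a": ("high", "none", "none"),
    "veg_b": ("high", "low", "none"),
    "water": ("none", "none", "high"),
}

if __name__ == "__main__":
    print(agglomerate(toy_profiles, n_clusters=3))
```

Stopping the merging at different cluster counts corresponds to reading a dendrogram at different levels, which is what allows the grouping to be interpreted according to the objective of the study.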

3.4.2. Automated Divisive Hierarchical Clustering (ADC)

Through analysing temperature data across different LCZs, ADC iteratively divided the dataset into smaller clusters using Ward’s method, ensuring that each resulting cluster represented a homogeneous group with similar thermal characteristics. The ultimate goal was to create clusters that not only have statistically significant temperature differences but also possess practical significance in terms of their impact on local climate conditions. The optimal number of clusters from the 26 built-in indices was six, which was picked by the Kurtosis L-method (KL) index and Hartigan index. The KL identified six as the point where adding additional clusters does not improve the statistical significance while the Hartigan used the intra- and inter-cluster variance to select six as the number of clusters that are optimal for the nature of the LST data. Ward’s algorithm then assigned each of the observations into these clusters (Figure 14). This was tested on the spring image, as the most thermally diverse season, as well as the winter image, as the one where there is the least variation in LST.

Local Climate Zone 2 (67%) and 3 (57%) are assigned to the same cluster (3). The same cluster also has the majority of LCZ 8 and 16. Local Climate Zone 7 is equally split between cluster 3 and cluster 6. Cluster 6 mostly contains the open built (LCZ 5 and 6) classes, while cluster 3 contains the compact (LCZ 2 and 3) built classes. The natural classes are separated, with each having its own cluster. Water (LCZ 17) is assigned to cluster 5, while LCZ 11 and 13 are assigned to cluster 4 and 2, respectively. LCZ 9 and LCZ 14 are put together into cluster 1. The biggest difference between the outcome of the automated clustering and the semi-automated is that the former puts each natural class, except LCZ 14, into its own cluster, while the latter assigns the densely vegetated (LCZ 11/A–B) and shrub classes (LCZ 13/C) to the same cluster.
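The percentages quoted above describe, for each LCZ, the share of its observation points that fall into each cluster. A minimal Python sketch of that cross-tabulation is given below; the function names and the toy assignments are hypothetical and only illustrate how a majority cluster and the minority shares can be derived from point-level cluster labels.

```python
from collections import Counter, defaultdict


def cluster_shares(assignments):
    """Share of each LCZ's points per cluster.

    assignments is an iterable of (lcz_label, cluster_id) pairs, one per point.
    """
    counts = defaultdict(Counter)
    for lcz, cluster in assignments:
        counts[lcz][cluster] += 1
    return {
        lcz: {cluster: n / sum(counter.values()) for cluster, n in counter.items()}
        for lcz, counter in counts.items()
    }


def majority_cluster(shares):
    """Cluster holding the largest share of each LCZ's points."""
    return {lcz: max(share, key=share.get) for lcz, share in shares.items()}


# Hypothetical point-level assignments (not study data).
toy_assignments = [("zone_a", 1), ("zone_a", 1), ("zone_a", 2),
                   ("zone_b", 2), ("zone_b", 2), ("zone_b", 2)]

if __name__ == "__main__":
    shares = cluster_shares(toy_assignments)
    print(shares)
    print(majority_cluster(shares))
```

Keeping the full share table, rather than only the majority label, preserves the minority memberships that the text later uses to interpret otherwise unexplained Tukey HSD results.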

However, as observed in the Tukey HSD test, the clustering for the winter month has groupings of LCZs which cannot be explained either by physical properties or by similarity of construction material. LCZ 2 and 3 are clustered together in cluster 1, while LCZ 5 and 6 are clustered together in cluster 5; this can be explained by the fact that building height does not have as great an impact in creating distinct thermal environments as differences in composition, density, and configuration do. It is thus logical for LCZ classes that are similar in those three physical properties to be assigned to the same cluster. LCZ 17 (water) and dense vegetation (LCZ 11) are also clustered together in the cluster with the lowest temperatures. This makes some logical sense, since it was also observed in the ranking of the means throughout the seasons that these two LCZs tend to have the lowest average temperatures, which can be attributed to evapotranspiration in the case of dense vegetation and thermal properties in the case of water. There is, however, clustering of LCZs such as 8 and 14 which cannot be simply and logically explained. This occurs in the cluster with the widest temperature range and is due to Ward’s method being sensitive to outliers. While the overall interpretation of ADC should favour the majority of the points within a particular LCZ, the minority that is assigned to other clusters helps us to explain patterns in the Tukey HSD which cannot be explained by physical characteristics, as observed in the heighted Tukey HSD between LCZ 13 and 2 and LCZ 13 and 5. These are classes that have no physical characteristics in common; however, the Tukey HSD showed that they had differences in their means that were below the threshold. The differences were not as small as those between LCZ 3 and 2 or LCZ 6 and 5; however, they were still not statistically significant, and could not be explained.
Even though the ADC places LCZ 13 into its own cluster, it also assigns a fraction of the points in LCZ 13 to cluster 3, which contains LCZ 2, and cluster 6, which contains LCZ 5. As such, the sensitivity of Ward’s method in ADC is an advantage in that it highlights secondary and tertiary thermal characteristics within a local climate zone and associates those characteristics with other LCZs.

3.4.3. Testing the SAAC and ADC Outputs for Thermal Environment Distinctness

The output from the semi-automated clustering was put through another round of Tukey HSD and compared with the output of the spring automated clustering. Both were tested on the spring data, with the LCZs grouped into clusters based on the findings of the algorithms (Table 4).

The results revealed that both final outputs create clusters whose differences are both statistically and practically significant (Figure 15). Even though four out of the six clusters have slightly different members, they still constitute distinct thermal environments. While the automated ADC method gives more detail in terms of the thermal nuances within each LCZ, the explanation for similarities still rests on the clustering made by the SAAC. This suggests that even with a clustering that is solely based on temperature properties, the semi-automated classification that is based primarily on physical properties can help with the interpretation of the outputs of the automated clustering.
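One simple way to check the practical distinctness of a clustering is to find the smallest gap between any two cluster means and compare it with the 1 °C threshold. The Python sketch below illustrates this check; the function names and the cluster means are hypothetical, and the check covers only the practical threshold, not the statistical significance assessed with Tukey's HSD.

```python
from itertools import combinations


def smallest_mean_gap(cluster_means):
    """Return (gap, (cluster_x, cluster_y)) for the closest pair of cluster means."""
    gap, pair = min(
        (abs(cluster_means[x] - cluster_means[y]), (x, y))
        for x, y in combinations(cluster_means, 2)
    )
    return gap, pair


def is_practically_distinct(cluster_means, threshold_c=1.0):
    """True if every pair of cluster means differs by at least the threshold."""
    gap, _ = smallest_mean_gap(cluster_means)
    return gap >= threshold_c


# Hypothetical mean LST per cluster in deg C (not study data).
toy_means = {"cluster_1": 20.0, "cluster_2": 23.5, "cluster_3": 24.75}

if __name__ == "__main__":
    gap, pair = smallest_mean_gap(toy_means)
    print(f"Smallest gap: {gap:.2f} deg C between {pair[0]} and {pair[1]}")
    print("Practically distinct:", is_practically_distinct(toy_means))
```

If the smallest gap clears the threshold, every other pair does as well, so a single comparison summarises the whole clustering.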

In order to arrive at an LCZ mapping that resembles classes with distinct surface thermal environments, a clustering is recommended, which in this case for Cape Town results in six clusters (Figure 16). While some of the clusters may differ, both outcomes are statistically viable for distinct thermal environments.

Because practical significance for urban planning is only reached once the LCZs are grouped into six clusters, the clustered framework is more applicable to the study of SUHI and LST patterns than the standard LCZ framework, particularly without building height. This is a version of the framework with large enough LST differences to inform urban policy.

4. Discussion

Using image classification techniques for LCZ mapping is advantageous for its ability to cover larger spatial extents without the time-consuming nature of manual digitising [69]. However, the ground-truthed polygons for model training and validation only represent a fraction of the entire area, meaning the pixels outside these polygons are subject to misclassification depending on the model’s training quality [70]. High classification accuracy is crucial in studies like this one, where the classified image forms the basis for further analysis. While spectral indices and neighbourhood functions for LCZ classification, combined with randomly selected points for LST assessment, enhance the robustness of the methods, they still do not eliminate the probability of error in the analysis. Even with a building height layer included in the fourth stack, a 28% error margin remained. This indicates that even with all the discriminatory parameters from the LCZ classification protocol met according to [37], there remains a probability of error, highlighting the need to reduce the classification error to minimise its impact on subsequent analyses. Ref. [71], having noted the limitations of LCZs in creating distinct surface thermal environments, suggested that hyperspectral imagery might be the solution towards eliminating LCZ classification errors that are propagated onto the LST analysis. These are, however, commercial products with limited access [72].

The analysis of LST mean patterns based on this classification shows that the built-up LCZs tend to be significantly warmer than the natural classes, as expected based on previous studies [26,73]. Within the built-up classes, the individual LCZs in the compact and open clusters, respectively, were shown to have interchangeable patterns and overlapping interquartile ranges across all periods of analysis. This indicates homogeneity within each cluster [74]. This suggests that building height, which is the only parameter that separates the high-, mid- and low-rise LCZs, does not result in a large temperature difference between the means of these classes. Similar to a study on Beijing by [75] focusing on exploring vertical dynamics of LST, our study found that the temperature at the top of a mid-rise building tends to be lower than that at the top of a low-rise LCZ. Across all seasons, compact mid-rise was lower in mean LST than compact low-rise. However, the Tukey HSD test reveals that these LST mean differences are below the 1 °C threshold. As such, the difference in the mean LSTs among the LCZs within the compact or open cluster is not large enough to have each individual LCZ regarded as a practically distinct thermal environment. Ref. [75] observed that the difference between temperature in compact mid-rise (LCZ 2) and low-rise (LCZ 3) was minimal. Therefore, the presence of a building height layer in the classification stack does not significantly change the LST patterns observed even without building height. This leads to the conclusion that even within the compact clusters, where building height is the only separator between low-, mid-, and high-rise LCZs, it still does not make a significant contribution towards creating distinct thermal environments.

A study on Indonesia by [76] showed that building height affects T_air significantly more than LST. This is attributed to the street canyons that redirect and amplify the winds, as opposed to the ground, which is heated from above by incoming solar radiation. In SUHI studies, the standard LCZ typology has been criticised for not accounting for the similarity in physical characteristics between LCZs, such as the composition and homogeneity of material, which can result in similar surface thermal environments [74]. This explains the similar spatial LST patterns found not only within the compact and open clusters individually, but also across the entire typology. The LCZ pairs 8 and 10, 2 and 3, 6 and 9, 9 and 14 (D), and 7 and 3 have been documented to potentially have similar surface thermal environments on the basis of their physical similarity in composition and homogeneity [71]. In our study, the same similarity was revealed by spectral confusion between these classes (Figure 10) and by similarities in thermal behaviour (Figure 12). Ultimately, the post hoc Tukey HSD test showed that, even on the basis of actual mean LST, neither member of each pair had a thermal environment distinct from the other (Figure 12). This points to homogeneity and composition as having a much greater role than building height in creating distinct surface thermal environments across the entire typology.

While this limitation of the LCZ framework, which results in similar thermal environments, has been documented by a number of studies, the solutions they propose make the framework more complex. Ref. [74] proposed the inclusion of socio-economic factors and details about material composition. Ref. [71] proposed improving the LCZ classification with hyperspectral imagery, which is not practical for all urban areas in the African context and is therefore a rather inaccessible approach. Ref. [75] suggested incorporating aspects of urban morphology into the framework, such as the variation in building height within each individual class and building footprints. That approach rests on the availability of height data, which is also a limitation in the African context. The approach used in our study, which simplifies the framework instead of making it more complex, provides a solution that is viable for estimating distinct surface thermal environments and accessible in the African context, where building height data and ultra-fine hyperspectral imagery are not readily accessible.

Studies by [28,77] have shown that surface temperature patterns are governed primarily by the impervious surface ratio and vegetation cover, above all other parameters. Most SUHI studies have proceeded on the theoretical assumption that local climate zones automatically result in distinct thermal environments, without investigating the validity of that claim for surface applications. Using the SAAC as well as the ADC to arrive at the number and composition of LCZ clusters for distinct thermal environments keeps the methodology robust enough to be adapted to studies in different climates and in cities with different numbers of LCZs present. The adoption of a flexible, semi-automated clustering approach is necessitated by the observation of [71,75] that the uniqueness of local urban form customises the appearance of each LCZ for each urban area. In a study on LCZ mapping in South Africa, [45] found that LCZ 6 (open low-rise) in Cape Town is closer in appearance to LCZ 9 (sparsely built), whereas in Thohoyandou it is closer to LCZ 3 (compact low-rise). This is due to the street canyons, erf sizes, and presence of vegetation. Another observation in the [45] study was that LCZ 10 in Cape Town has more impervious surfaces than in smaller, more rural urban areas, where it has elements of bare soil. This suggests that in the smaller urban areas the SAAC might not place LCZ 10 in the same cluster as LCZ 7, as it did for Cape Town. Thus, the reliance of the SAAC on an adaptable input matrix of categorical parameters for each LCZ allows the city-specific urban form to be captured in the framework. The ADC, with its internal clustering processes based on the LST data for that urban area, is equally able to form clusters based not on an existing theoretical database but on the actual data for that specific area. This combination of algorithms thus captures city-specific urban forms as well as the unique microclimates they create.
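To make the idea of clustering LCZs from an adaptable matrix of categorical parameters concrete, the following is a minimal Python sketch of agglomerative clustering. It is not the authors' SAAC implementation: the Hamming distance, the average linkage, the three attributes and their values are all assumptions chosen for illustration.

```python
from itertools import combinations


def hamming(u, v):
    """Share of categorical attributes on which two LCZ classes differ."""
    return sum(x != y for x, y in zip(u, v)) / len(u)


def average_linkage(features, a, b):
    """Mean pairwise Hamming distance between the members of clusters a and b."""
    dists = [hamming(features[p], features[q]) for p in a for q in b]
    return sum(dists) / len(dists)


def agglomerate(features, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    clusters = [[name] for name in sorted(features)]
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: average_linkage(features, pair[0], pair[1]),
        )
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(sorted(a + b))
    return sorted(clusters)


# Hypothetical input matrix: (built density, imperviousness, vegetation cover).
lcz_features = {
    "LCZ2": ("compact", "high", "low"),
    "LCZ3": ("compact", "high", "low"),
    "LCZ5": ("open", "medium", "medium"),
    "LCZ6": ("open", "medium", "medium"),
    "LCZ11": ("none", "low", "high"),
}
print(agglomerate(lcz_features, 3))
# [['LCZ11'], ['LCZ2', 'LCZ3'], ['LCZ5', 'LCZ6']]
```

Because the input matrix is just a dictionary of categorical descriptors, a different city could supply its own descriptors for the same LCZ labels and obtain a different grouping, which is the adaptability the paragraph above describes.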

5. Conclusions

While the study indicates no practically significant distinction in mean LST between LCZ classes separated solely by building height, this does not diminish the importance of building height in the urban environment. Building height can influence microclimates, primarily through the shading effects of taller buildings, which create localised areas of lower LST [78]. These shading effects do not shift mean LST enough to render the mid- and high-rise LCZs thermal environments distinct from classes that differ from them physically only in building height. They can still be relevant for studies of small-scale thermal variation, as opposed to UHI-focused urban planning, which is guided by policies based on practical LST differences rather than statistical significance [79]. This makes the LCZ framework a versatile and invaluable tool for studying urban dynamics. The methodology developed in this study, which clusters LCZs, addresses the limited distinctness of surface thermal environments noted by previous studies [71,74] while keeping the individual LCZs intact. This leaves room for the individual LCZs to be used to study processes that are sensitive to temperature differences below the 1 °C threshold, while the clusters are used to study SUHI and inform urban planning for climate-sensitive development. The clustered version of the framework avoids building height-related error propagation by dropping the distinctions between low-, mid-, and high-rise classes. This makes the method particularly suitable for areas where LCZs are often classified without building height due to limited LiDAR accessibility.

Author Contributions

Conceptualization, T.M., B.S. and B.V.; Data curation, T.M.; Formal analysis, T.M.; Funding acquisition, B.S., B.V. and N.S.N.; Methodology, T.M., B.S. and B.V.; Software, T.M.; visualisation, T.M.; Writing—original draft, T.M.; Writing—review and editing, T.M., B.S. and B.V. All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded by VLIR-UOS, the Flemish University Council for University Development Cooperation through the ReSider project and the Flemish government funded SAF-ADAPT project. Funding number: 000000166183.

Data Availability Statement

Data are available on request from the authors.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. LCZ Maps for All Seasons

Figure A1. LCZ classified images for summer, autumn, winter and spring 2020.

Appendix B. LST Maps for All Seasons

Figure A2. Extracted LST for Cape Town for summer, autumn, winter and spring 2020.

References

1. He, W.; Goodkind, D.; Kowal, P.R. An Aging World: 2015; United States Census Bureau: Washington, DC, USA, 2016.
2. Jain, N.; Kourampi, I.; Umar, T.P.; Almansoor, Z.R.; Anand, A.; Rehman, M.E.U.; Jain, S.; Reinis, A. Global population surpasses eight billion: Are we ready for the next billion? AIMS Public Health 2023, 10, 849–866.
3. Zu Selhausen, F.M. Urban migration in east and west Africa since 1950. In Migration in Africa; Routledge: London, UK, 2022; pp. 281–307.
4. Zhou, D.; Xiao, J.; Bonafoni, S.; Berger, C.; Deilami, K.; Zhou, Y.; Frolking, S.; Yao, R.; Qiao, Z.; Sobrino, J.A. Satellite remote sensing of surface urban heat islands: Progress, challenges, and perspectives. Remote Sens. 2018, 11, 48.
5. Romero, J.P.; Bottega, A.; Cordeiro, A.B. The impact of demand on innovation and research intensity. Int. Rev. Appl. Econ. 2023, 37, 217–235.
6. Spence, M.; Annez, P.C.; Buckley, R.M. Urbanization and Growth; World Bank Publications: Chicago, IL, USA, 2008.
7. Li, X.; Stringer, L.C.; Chapman, S.; Dallimer, M. How urbanisation alters the intensity of the urban heat island in a tropical African city. PLoS ONE 2021, 16, e0254371.
8. Barlow, J.F.; Halios, C.H.; Lane, S.E.; Wood, C.R. Observations of urban boundary layer structure during a strong urban heat island event. Environ. Fluid Mech. 2015, 15, 373–398.
9. Oke, T.R. The heat island of the urban boundary layer: Characteristics, causes and effects. In Wind Climate in Cities; Springer: Dordrecht, The Netherlands, 1995; pp. 81–107.
10. Lauwaet, D.; De Ridder, K.; Saeed, S.; Brisson, E.; Chatterjee, F.; van Lipzig, N.; Maiheu, B.; Hooyberghs, H. Assessing the current and future urban heat island of Brussels. Urban Clim. 2016, 15, 1–15.
11. Guillevic, P.; Göttsche, F.; Hulley, J.; Ghent, G. Land surface temperature product validation best practice protocol. Version 1.1. In Best Practice for Satellite-Derived Land Product Validation; Elsevier: Amsterdam, The Netherlands, 2018; Volume 60.
12. Payne, A.E.; Demory, M.-E.; Leung, L.R.; Ramos, A.M.; Shields, C.A.; Rutz, J.J.; Siler, N.; Villarini, G.; Hall, A.; Ralph, F.M. Responses and impacts of atmospheric rivers to climate change. Nat. Rev. Earth Environ. 2020, 1, 143–157.
13. Chakraborty, S.D.; Kant, Y.; Mitra, D. Assessment of land surface temperature and heat fluxes over Delhi using remote sensing data. J. Environ. Manag. 2015, 148, 143–152.
14. Kruger, A.C.; Sekele, S.S. Trends in extreme temperature indices in South Africa: 1962–2009. Int. J. Climatol. 2013, 33, 661–676.
15. Li, Z.-L.; Wu, H.; Duan, S.-B.; Zhao, W.; Ren, H.; Liu, X.; Leng, P.; Tang, R.; Ye, X.; Zhu, J.; et al. Satellite remote sensing of global land surface temperature: Definition, methods, products, and applications. Rev. Geophys. 2023, 61, e2022RG000777.
16. Mutiibwa, D.; Strachan, S.; Albright, T. Land surface temperature and surface air temperature in complex terrain. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2015, 8, 4762–4774.
17. Cao, J.; Zhou, W.; Zheng, Z.; Ren, T.; Wang, W. Within-city spatial and temporal heterogeneity of air temperature and its relationship with land surface temperature. Landsc. Urban Plan. 2021, 206, 103979.
18. Zhang, Y.; Zhang, J.; Zhang, X.; Zhou, D.; Gu, Z. Analyzing the characteristics of UHI (Urban heat island) in summer daytime based on observations on 50 sites in 11 LCZ (local climate zone) types in Xi’an, China. Sustainability 2020, 13, 83.
19. Lin, H. Urban heat island distribution observation by integrating remote sensing technology and deep learning. Int. J. Image Data Fusion 2024, 1–17.
20. Ferrero, L.; Gregorič, A.; Močnik, G.; Rigler, M.; Cogliati, S.; Barnaba, F.; Di Liberto, L.; Gobbi, G.P.; Losi, N.; Bolzacchini, E. The impact of cloudiness and cloud type on the atmospheric heating rate of black and brown carbon in the Po Valley. Atmos. Chem. Phys. 2021, 21, 4869–4897.
21. Peng, S.; Piao, S.; Ciais, P.; Friedlingstein, P.; Ottle, C.; Bréon, F.-M.; Nan, H.; Zhou, L.; Myneni, R.B. Surface urban heat island across 419 global big cities. Environ. Sci. Technol. 2012, 46, 696–703.
22. Martin, P.; Baudouin, Y.; Gachon, P. An alternative method to characterize the surface urban heat island. Int. J. Biometeorol. 2015, 59, 849–861.
23. Wang, L.; Lu, Y.; Yao, Y. Comparison of three algorithms for the retrieval of land surface temperature from Landsat 8 images. Sensors 2019, 19, 5049.
24. He, T.; Zhou, R.; Ma, Q.; Li, C.; Liu, D.; Fang, X.; Hu, Y.; Gao, J. Quantifying the effects of urban development intensity on the surface urban heat island across building climate zones. Appl. Geogr. 2023, 158, 103052.
25. Beuster, L.R.N. Urban Heat Islands in South Africa: A Case Study of Cape Town. Doctoral Dissertation, Stellenbosch University, Stellenbosch, South Africa, 2019.
26. Wang, C.; Chang, H.-T. Hotspots, heat vulnerability and urban heat islands: An Interdisciplinary Review of Research Methodologies. Can. J. Remote Sens. 2020, 46, 532–551.
27. Liu, J.; Li, J.; Qin, K.; Zhou, Z.; Yang, X.; Li, T. Changes in land-uses and ecosystem services under multi-scenarios simulation. Sci. Total Environ. 2017, 586, 522–526.
28. Baldinelli, G.; Bonafoni, S. Analysis of albedo influence on surface urban heat island by spaceborne detection and airborne thermography. In Proceedings of the New Trends in Image Analysis and Processing: ICIAP 2015 Workshops: ICIAP 2015 International Workshops, BioFor, CTMR, RHEUMA, ISCA, MADiMa, SBMI, and QoEM, Genoa, Italy, 7–8 September 2015; Proceedings 18. Springer: Berlin/Heidelberg, Germany, 2015; pp. 95–102.
29. Shi, Y.; Xiang, Y.; Zhang, Y. Urban design factors influencing surface urban heat island in the high-density city of Guangzhou based on the local climate zone. Sensors 2019, 19, 3459.
30. van der Waal, B.; Grenfell, S.; Huchzermeyer, N.; Schlegel, P. Selecting and refining suitable methods of developing digital elevation models for wetlands in data-scarce environments. Wetl. Ecol. Manag. 2022, 31, 539–550.
31. Alexander, C. Influence of the proportion, height and proximity of vegetation and buildings on urban land surface temperature. Int. J. Appl. Earth Obs. Geoinf. 2021, 95, 102265.
32. Hou, J. Guerrilla Urbanism: Urban Design and the Practices of Resistance; Springer: Berlin/Heidelberg, Germany, 2020.
33. Lourenço, I.B.; Guimarães, L.F.; Alves, M.B.; Miguez, M.G. Land as a sustainable resource in city planning: The use of open spaces and drainage systems to structure environmental and urban needs. J. Clean. Prod. 2020, 276, 123096.
34. Ibrahimy, R.; Mohmmand, M.A.; Elham, F.A. An evaluation of space use efficiency in residential houses, Kabul city. J. Res. Appl. Sci. Biotechnol. 2023, 2, 1–6.
35. USGS. Landsat 8 (L8) Data Users Handbook; Earth Resources Observation and Science (EROS) Center: Sioux Falls, SD, USA, 2015.
36. Langford, R.L. Temporal merging of remote sensing data to enhance spectral regolith, lithological and alteration patterns for regional mineral exploration. Ore Geol. Rev. 2015, 68, 14–29.
37. Stewart, I.D.; Oke, T.R. Local climate zones for urban temperature studies. Bull. Am. Meteorol. Soc. 2012, 93, 1879–1900.
38. Bechtel, B.; Alexander, P.J.; Beck, C.; Böhner, J.; Brousse, O.; Ching, J.; Demuzere, M.; Fonte, C.; Gál, T.; Hidalgo, J.; et al. Generating WUDAPT Level 0 data–Current status of production and evaluation. Urban Clim. 2019, 27, 24–45.
39. Çilek, M.Ü. A Vector-Based Mapping in GIS Environment to Classify Local Climate Zone. Çukurova Üniversitesi Mühendislik Fakültesi Derg. 2021, 36, 929–940.
40. Zheng, Y.; Ren, C.; Xu, Y.; Wang, R.; Ho, J.; Lau, K.; Ng, E. GIS-based mapping of Local Climate Zone in the high-density city of Hong Kong. Urban Clim. 2018, 24, 419–448.
41. Pettorelli, N. The Normalized Difference Vegetation Index; Oxford University Press: New York, NY, USA, 2013.
42. Zha, Y.; Gao, J.; Ni, S. Use of normalized difference built-up index in automatically mapping urban areas from TM imagery. Int. J. Remote Sens. 2003, 24, 583–594.
43. As-Syakur, A.R.; Adnyana, I.W.S.; Arthana, I.W.; Nuarsa, I.W. Enhanced built-up and bareness index (EBBI) for mapping built-up and bare land in an urban area. Remote Sens. 2012, 4, 2957–2970.
44. Warner, T. Kernel-based texture in remote sensing image classification. Geogr. Compass 2011, 5, 781–798.
45. Manyanya, T.; Teerlinck, J.; Somers, B.; Verbist, B.; Nethengwe, N. Sentinel-Based Adaptation of the Local Climate Zones Framework to a South African Context. Remote Sens. 2022, 14, 3594.
46. Ndossi, M.I.; Avdan, U. Application of open source coding technologies in the production of land surface temperature (LST) maps from Landsat: A PyQGIS plugin. Remote Sens. 2016, 8, 413.
47. Zhao, C. Linking the local climate zones and land surface temperature to investigate the surface urban heat island, a case study of San Antonio, Texas, US. ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci. 2018, 4, 277–283.
48. Trinh, L.H.; Vu, D.T. Comparison of single-channel and split-window methods for estimating land surface temperature from Landsat 8 data. VNU J. Sci. Earth Environ. Sci. 2019, 35.
49. Jimenez-Munoz, J.C.; Sobrino, J.A.; Skoković, D.; Mattar, C.; Cristobal, J. Land surface temperature retrieval methods from Landsat-8 thermal infrared sensor data. IEEE Geosci. Remote Sens. Lett. 2014, 11, 1840–1843.
50. Wellek, S. Testing Statistical Hypotheses of Equivalence; Chapman and Hall/CRC: Boca Raton, FL, USA, 2002.
51. Mohajeri, K.; Mesgari, M.; Lee, A.S. When Statistical Significance Is Not Enough: Investigating Relevance, Practical Significance, and Statistical Significance. MIS Q. 2020, 44, 525.
52. Wilkinson, M. Distinguishing between statistical significance and practical/clinical meaningfulness using statistical inference. Sports Med. 2014, 44, 295–301.
53. Voogt, J.A.; Oke, T.R. Thermal remote sensing of urban climates. Remote Sens. Environ. 2003, 86, 370–384.
54. Santamouris, M. On the energy impact of urban heat island and global warming on buildings. Energy Build. 2014, 82, 100–113.
55. Gill, S.E.; Handley, J.F.; Ennos, A.R.; Pauleit, S. Adapting cities for climate change: The role of the green infrastructure. Built Environ. 2007, 33, 115–133.
56. Gago, E.J.; Roldan, J.; Pacheco-Torres, R.; Ordóñez, J. The city and urban heat islands: A review of strategies to mitigate adverse effects. Renew. Sustain. Energy Rev. 2013, 25, 749–758.
57. Montero, P.; Vilar, J.A. TSclust: An R package for time series clustering. J. Stat. Softw. 2015, 62, 1–43.
58. Charrad, M.; Ghazzali, N.; Boiteau, V.; Niknafs, A. NbClust: An R package for determining the relevant number of clusters in a data set. J. Stat. Softw. 2014, 61, 1–36.
59. Wang, M.; Abrams, Z.B.; Kornblau, S.M.; Coombes, K.R. Thresher: Determining the number of clusters while removing outliers. BMC Bioinform. 2018, 19, 9.
60. Charrad, M.; Ghazzali, N.; Boiteau, V.; Niknafs, A. NbClust Package. An Examination of Indices for Determining the Number of Clusters. 2012. Available online: https://cedric.cnam.fr/fichiers/art_2554.pdf (accessed on 13 April 2024).
61. Murtagh, F.; Contreras, P. Algorithms for hierarchical clustering: An overview. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2012, 2, 86–97.
62. Sharma, M.; Kumar, C.J.; Deka, A. Land cover classification: A comparative analysis of clustering techniques using Sentinel-2 data. Int. J. Sustain. Agric. Manag. Inform. 2021, 7, 321–342.
63. Wang, M.; Zhang, Y.-Y.; Min, F.; Deng, L.-P.; Gao, L. A two-stage density clustering algorithm. Soft Comput. 2020, 24, 17797–17819.
64. Pelegrina, G.D.; Duarte, L.T.; Romano, J.M.T. Application of independent component analysis and TOPSIS to deal with dependent criteria in multicriteria decision problems. Expert Syst. Appl. 2019, 122, 262–280.
65. Tibshirani, R. High-Dimensional Regression: Ridge. Available online: https://www.stat.berkeley.edu/~ryantibs/statlearn-s24/lectures/ridge.pdf (accessed on 13 April 2024).
66. Zhu, S.; Xu, L.; Goodman, E.D. Hierarchical topology-based cluster representation for scalable evolutionary multiobjective clustering. IEEE Trans. Cybern. 2021, 52, 9846–9860.
67. Bechtel, B.; Alexander, P.J.; Böhner, J.; Ching, J.; Conrad, O.; Feddema, J.; Mills, G.; See, L.; Stewart, I. Mapping local climate zones for a worldwide database of the form and function of cities. ISPRS Int. J. Geoinf. 2015, 4, 199–219.
68. Peeters, M.J. Practical significance: Moving beyond statistical significance. Curr. Pharm. Teach. Learn. 2016, 8, 83–89.
69. Rwanga, S.S.; Ndambuki, J.M. Accuracy Assessment of Land Use/Land Cover Classification Using Remote Sensing and GIS. Int. J. Geosci. 2017, 8, 611–622.
70. Song, J.; Gao, S.; Zhu, Y.; Ma, C. A survey of remote sensing image classification based on CNNs. Big Earth Data 2019, 3, 232–254.
71. Bechtel, B.; Demuzere, M.; Sismanidis, P.; Fenner, D.; Brousse, O.; Beck, C.; Van Coillie, F.; Conrad, O.; Keramitsoglou, I.; Middel, A.; et al. Quality of crowdsourced data on urban morphology: The human influence experiment (HUMINEX). Urban Sci. 2017, 1, 15.
72. Cho, H.; Lee, K.-S. Comparison between hyperspectral and multispectral images for the classification of coniferous species. Korean J. Remote Sens. 2014, 30, 25–36.
73. Bechtel, B.; Demuzere, M.; Mills, G.; Zhan, W.; Sismanidis, P.; Small, C.; Voogt, J. SUHI analysis using Local Climate Zones: A comparison of 50 cities. Urban Clim. 2019, 28, 100451.
74. Johnson, B.A.; Jozdani, S.E. Local climate zone (LCZ) map accuracy assessments should account for land cover physical characteristics that affect the local thermal environment. Remote Sens. 2019, 11, 2420.
75. Zheng, Z.; Zhou, W.; Yan, J.; Qian, Y.; Wang, J.; Li, W. The higher, the cooler? Effects of building height on land surface temperatures in residential areas of Beijing. Phys. Chem. Earth Parts A/B/C 2019, 110, 149–156.
76. Syafitri, R.A.W.D.; Pamungkas, A.; Santoso, E.B. Urban Form Factors that Play Important Roles on UHI Spatial-Temporal Pattern: A Case Study of East Surabaya, Indonesia. IOP Conf. Ser. Earth Environ. Sci. 2021, 764, 12030.
77. Liu, H.; Huang, B.; Zhan, Q.; Gao, S.; Li, R.; Fan, Z. The influence of urban form on surface urban heat island and its planning implications: Evidence from 1288 urban clusters in China. Sustain. Cities Soc. 2021, 71, 102987.
78. Guo, Z.; Zhang, Z.; Wu, X.; Wang, J.; Zhang, P.; Ma, D.; Liu, Y. Building shading affects the ecosystem service of urban green spaces: Carbon capture in street canyons. Ecol. Modell. 2020, 431, 109178.
79. Vandamme, S.; Demuzere, M.; Verdonck, M.-L.; Zhang, Z.; Van Coillie, F. Revealing Kunming’s (China) historical urban planning policies through local climate zones. Remote Sens. 2019, 11, 1731.

Figure 1. Cape Town locality map, showing its position relative to the African continent.

Figure 2. Landsat 7 (ETM+) and Landsat 8 (OLI, TIRS) band frequencies [36].

Figure 3. LCZ framework typology showing the 10 built and 7 natural classes.

Figure 4. LiDAR grids for the heighted stack.

Figure 5. Methodological flow diagram for the extraction of LST from Landsat 7 (band 6) and Landsat 8 (band 10) satellite imagery using Planck’s function inversion in a single channel algorithm on Google Earth Engine.

Figure 6. Cape Town spring LCZ map with 5-pixel neighbourhood function (NH) (left) and spring LST map extracted from Landsat 8 (OLI-TIRS) (right). The full set of LCZ and LST images is in Appendix A and Appendix B.

Figure 7. LST mean graphs per season (summer (S), autumn (A), winter (W), spring (S)).

Figure 8. Boxplot for each season of LST showing interquartile ranges, standard deviations and means of each LCZ.

Figure 9. Tukey HSD test plots showing pairs whose LST differences are not statistically significant (Red) and/or practically significant (Blue) for spring (A,B), winter (C,D), autumn (E,F), and summer (G,H).

Figure 10. Confusion matrices for the 2020 images for spring (Top left), winter (Top Right), autumn (Bottom Left), and summer (Bottom Right).

Figure 11. LCZ classification for Spring 2020 with height included. Nested accuracy table of producer’s accuracy (PA).

Figure 12. Tukey HSD test plots showing pairs whose LST difference is not statistically significant (Red) and/or practically significant (Blue) for an LCZ classification with height data for spring 2020.

Figure 13. Dendrogram output for the SAAC, showing the different levels at which LCZ can be clustered based on the provided matrix of feature characteristics.

Figure 14. ADC output of the dendrogram showing temperature limits within each cluster, and the associated table showing how much of each LCZ was assigned to which cluster by Ward’s method. Spring (Top) and winter (Bottom), where the pie chart within each cell represents the extent to which a certain cluster (temperature range) dominates a certain LCZ.

Figure 15. Tukey HSD test plots for the SAAC and ADC outputs with a confidence interval of 95% and a practical significance threshold of 1 °C.

Figure 16. Maps of LCZ clusters from the fully automated (left) and semi-automated (right) clustering algorithm outputs.

Table 1. Equations and constants for calculating the brightness temperature, spectral indices and emissivity needed for the LST extraction [49].

Equations	Input Parameters
T_{sen} = \frac{k_{2}}{\ln\left(\frac{k_{1}}{L_{TOA\lambda}} + 1\right)}	k_{1} and k_{2} are thermal infrared sensor calibration constants; L_{TOA\lambda} is the radiance.
L_{TOA\lambda} = gain \cdot DN + bias	DN is the digital number of a pixel; gain and bias are rescaling values found in the imagery metadata file.
NDVI = \frac{Band5\,(NIR) - Band4\,(Red)}{Band5\,(NIR) + Band4\,(Red)}	
\epsilon = \epsilon_{v\lambda} \rho_{v} + \epsilon_{s\lambda} (1 - \rho_{v}) + C_{\lambda}	\epsilon_{v\lambda} (vegetation emissivity) = 0.986 and \epsilon_{s\lambda} (soil emissivity) = 0.966; C_{\lambda} is the cavity effect; \rho_{v} is the vegetation fraction.
C_{\lambda} = (1 - \epsilon_{v\lambda}) \epsilon_{s\lambda} F' (1 - \rho_{v})	F' is a constant with value 0.55.
\rho_{v} = \left[\frac{NDVI - NDVI_{min}}{NDVI_{max} - NDVI_{min}}\right]^{2}	
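The equations in Table 1 can be sketched in Python as follows. This is an illustrative transcription, not the Google Earth Engine workflow used in the study. The function names are hypothetical, and the example calibration values (gain, bias, k1, k2) are typical Landsat 8 band 10 metadata values assumed here for illustration; in practice they should be read from each scene's metadata file. The sketch stops at emissivity because the final LST step is not given in Table 1.

```python
import math

EPS_VEG = 0.986   # vegetation emissivity (Table 1)
EPS_SOIL = 0.966  # soil emissivity (Table 1)
F_PRIME = 0.55    # constant F' (Table 1)


def toa_radiance(dn, gain, bias):
    """Top-of-atmosphere radiance from a pixel's digital number."""
    return gain * dn + bias


def brightness_temperature(radiance, k1, k2):
    """At-sensor brightness temperature (K) by inverting Planck's function."""
    return k2 / math.log(k1 / radiance + 1.0)


def ndvi(nir, red):
    """Normalised difference vegetation index from NIR and red reflectance."""
    return (nir - red) / (nir + red)


def vegetation_fraction(ndvi_value, ndvi_min, ndvi_max):
    """Vegetation fraction rho_v as the squared, rescaled NDVI."""
    return ((ndvi_value - ndvi_min) / (ndvi_max - ndvi_min)) ** 2


def emissivity(rho_v):
    """Surface emissivity including the cavity term C_lambda, as written in Table 1."""
    cavity = (1.0 - EPS_VEG) * EPS_SOIL * F_PRIME * (1.0 - rho_v)
    return EPS_VEG * rho_v + EPS_SOIL * (1.0 - rho_v) + cavity


# Assumed example inputs; real values come from the scene metadata.
radiance = toa_radiance(dn=30000, gain=3.342e-4, bias=0.1)
t_sen = brightness_temperature(radiance, k1=774.8853, k2=1321.0789)
rho_v = vegetation_fraction(ndvi(nir=0.45, red=0.15), ndvi_min=0.2, ndvi_max=0.8)
print(round(t_sen, 1), round(rho_v, 3), round(emissivity(rho_v), 4))
```

Keeping each equation as its own small function makes it easy to check every term against the table before applying the same arithmetic to full image bands.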

Table 2. Accuracy metrics for each season, based on the month in the middle of the season.

	Summer			Autumn			Winter			Spring
	Raw	SI	SI-NH	Raw	SI	SI-NH	Raw	SI	SI-NH	Raw	SI	SI-NH
OA	32	41	59	34	45	48	23	33	44	39	54	61
OA_U	24	31	51	16	30	41	17	32	39	24	42	51
OA_N	67	76	82	22	78	83	76	76	83	68	78	91
Kappa	26	33	55	31	36	44	20	30	40	31	45	58
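For readers unfamiliar with the metrics in Table 2, the sketch below shows how overall accuracy and a kappa coefficient are commonly derived from a confusion matrix. It assumes the Kappa row refers to Cohen's kappa, and the two-class matrix is a hypothetical example, not data from the study.

```python
def overall_accuracy(matrix):
    """Share of validation samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def cohen_kappa(matrix):
    """Agreement beyond chance: (p_o - p_e) / (1 - p_e)."""
    total = sum(sum(row) for row in matrix)
    p_o = overall_accuracy(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    p_e = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical two-class confusion matrix (rows: reference, columns: predicted).
example = [
    [40, 10],
    [20, 30],
]
print(round(overall_accuracy(example), 2), round(cohen_kappa(example), 2))  # 0.7 0.4
```

The gap between the two numbers illustrates why kappa is reported alongside overall accuracy: part of the raw agreement would be expected by chance alone.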

Table 3. ANOVA output summary for the 2020 data of all four seasons.

Image	MS	F-value	p-value
Spring	40,358.51	4090.432	<2 × 10⁻¹⁶
Autumn	13,821.2	2669.465	<2 × 10⁻¹⁶
Winter	4883.25	873.2741	<2 × 10⁻¹⁶
Summer	39,061.95	3631.139	<2 × 10⁻¹⁶
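As a reminder of how the quantities in Table 3 relate, the following sketch computes a one-way ANOVA by hand. It assumes that MS denotes the between-group mean square and that the F-value is its ratio to the within-group mean square; the three small groups are hypothetical numbers, not LST samples from the study.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (between-group mean square, F statistic) for a one-way ANOVA."""
    values = [v for group in groups for v in group]
    grand_mean = mean(values)
    k, n = len(groups), len(values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between, ms_between / ms_within


# Hypothetical groups standing in for LST samples from three LCZ classes.
ms, f_value = one_way_anova([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]])
print(round(ms, 3), round(f_value, 3))  # 13.0 13.0
```

A large F-value, as in Table 3, indicates that the variation between class means is large relative to the variation within classes, which is what motivates the follow-up pairwise Tukey HSD comparisons.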

Table 4. A summary of the outputs of the SAAC and ADC clusters for spring.

Cluster	Semi-Automated (SAAC)	Automated (ADC)
1	LCZ 9 and 14 (low veg and sparsely built)	LCZ 9 and 14 (low veg and sparsely built)
2	LCZ 7 and 10 (industrial-looking)	LCZ 13 (shrublands)
3	LCZ 2, 3 and 8 (impervious)	LCZ 2, 3, 8, 10 (impervious)
4	LCZ 11 and 13 (highly vegetated)	LCZ 11 (densely vegetated)
5	LCZ 17 (water bodies)	LCZ 17 (water bodies)
6	LCZ 5, 6 and 16 (open)	LCZ 5, 6, 7, 16 (open)

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
