Explainable computer vision for cities
| Authors | |
|---|---|
| Supervisors | |
| Cosupervisors | |
| Award date | 23-04-2026 |
| ISBN | |
| Number of pages | 163 |
| Organisations | |
| Abstract | This thesis studies how computer vision can be applied to street view imagery to model the relationships between visual urban features and socio-economic characteristics within cities. The central research question is: What challenges must be addressed for computer vision–based models of socio-economic urban dynamics to evolve into robust, interpretable systems that municipalities can operationalize to create more equitable cities? First, we combine the current paradigm for Perceptive Visual Urban Analytics with explainability approaches to understand the relationship between visual elements present in street view imagery and housing prices. We then evaluate the interpretability and trustworthiness of the resulting explanations through an expert user study. Building on these results, we focus on finding actionable elements in street view scenes: visual elements that the municipality can act upon and thus change. We develop a self-supervised training method to detect urban change and apply it throughout the city at scale. To assess whether street view imagery is a skewed data source, we evaluate whether global street view databases provide uniform coverage within cities. Finally, by looking at typical storefront aesthetics, we explore whether visual data alone is enough to model socio-economic city dynamics effectively. We conclude that neither images nor metadata alone suffices; rather, a combination of the two is essential to model the nuances of urban dynamics. |
| Document type | PhD thesis |
| Language | English |
