Spatially-Aware Domain Adaptation for Semantic Segmentation of Urban Scenes

Yong-Xiang Lin, Daniel Stanley Tan, Wen-Huang Cheng, Yung-Yao Chen, Kai-Lung Hua

Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceeding › Academic › peer-review


It is very expensive and time-consuming to collect a dataset with pixel-level annotations large enough to train a semantic segmentation model. Synthetic datasets are common alternatives for training segmentation models; however, models trained on synthetic data do not necessarily perform well on real-world images due to the domain shift problem. Domain adaptation techniques address this problem by leveraging adversarial training to align features. Prior works have mostly performed global feature alignment and do not consider the positions of objects. However, objects in urban scenes are highly correlated with their spatial locations. For example, the sky will always appear at the top, while cars will usually appear in the middle of the image. Based on this insight, we propose a spatially-aware discriminator that accounts for the spatial prior on the objects in order to improve the feature alignment. We demonstrate in our experiments that our model outperforms several state-of-the-art baselines in terms of mean intersection over union (mIoU).
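The abstract does not specify how the discriminator is made spatially aware; one common way to give a discriminator access to position information is to append normalized coordinate channels to its input feature map, so the real/fake decision can condition on where in the image a feature appears (e.g. sky features near the top). The helper below is a minimal NumPy sketch of that idea only, not the paper's actual architecture; the function name `add_coord_channels` is illustrative.

```python
import numpy as np

def add_coord_channels(features):
    """Append normalized (row, col) coordinate maps to a feature map.

    features: array of shape (C, H, W); returns (C + 2, H, W).
    A discriminator fed this input can condition its decision on
    spatial position (e.g. "sky" features near the top of the image).
    Illustrative sketch only; not the paper's actual mechanism.
    """
    c, h, w = features.shape
    # Row coordinates run 0 -> 1 from top to bottom, tiled across columns.
    rows = np.linspace(0.0, 1.0, h).reshape(h, 1).repeat(w, axis=1)
    # Column coordinates run 0 -> 1 from left to right, tiled across rows.
    cols = np.linspace(0.0, 1.0, w).reshape(1, w).repeat(h, axis=0)
    return np.concatenate([features, rows[None], cols[None]], axis=0)
```

In an adversarial setup, both source (synthetic) and target (real) feature maps would pass through this augmentation before the discriminator, making the feature alignment position-dependent rather than purely global.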
Original language: English
Title of host publication: 2019 IEEE International Conference on Image Processing (ICIP)
Number of pages: 5
ISBN (Electronic): 978-1-5386-6249-6
ISBN (Print): 978-1-5386-6250-2
Publication status: Published - Sept 2019
Externally published: Yes
Event: IEEE International Conference on Image Processing - Taipei, Taiwan, Province of China
Duration: 22 Sept 2019 - 25 Sept 2019


Conference: IEEE International Conference on Image Processing
Abbreviated title: ICIP 2019
Country/Territory: Taiwan, Province of China


  • Domain adaptation
  • Semantic segmentation
  • Spatial structure

