Spatial-Elevation Guided Hyperspectral Representation Learning for Complex Urban Surface Mapping
Keywords:
Hyperspectral imaging, spatial-elevation fusion, representation learning, urban surface mapping, remote sensing, deep learning, multi-modal data integration, system architecture, fairness, sustainability
Abstract
Accurate mapping of complex urban surfaces remains a fundamental challenge for remote sensing and Earth observation systems, particularly when high spectral resolution must be combined with three-dimensional structural information. Hyperspectral imaging provides rich spectral signatures that can distinguish materials with subtle differences, yet the spatial heterogeneity and elevation variability of urban environments often cause classification confusion. This paper proposes a spatial-elevation guided hyperspectral representation learning framework that integrates spectral, spatial, and elevation modalities through a multi-stream architecture with cross-modal attention mechanisms. The framework is designed to address three structural trade-offs: spectral fidelity versus geometric detail, computational efficiency versus representational capacity, and local feature extraction versus global contextual understanding. We discuss the engineering principles underlying the proposed system, including the design of elevation-aware convolutional modules, the use of hierarchical fusion strategies, and the control of training stability under limited labeled samples. Through a series of case illustrations on benchmark urban datasets, we demonstrate that spatial-elevation guidance significantly improves classification accuracy for rare and spectrally similar surface materials, such as roofing types, asphalt conditions, and vegetation subclasses. We also examine the broader socio-technical implications of such mapping systems, including fairness in resource allocation for urban planning, robustness against sensor noise and seasonal variations, and the policy challenges of integrating high-resolution urban maps into municipal governance. This work contributes both a technically rigorous representation learning approach and a critical reflection on the sustainable deployment of deep learning-based remote sensing systems in complex urban infrastructures.
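The abstract does not specify how the cross-modal attention is built, so the following is only a minimal sketch of one common formulation, scaled dot-product attention, in which spectral features act as queries over elevation features. It is not the authors' implementation. The function name `cross_modal_attention`, the projection matrices, the feature dimensions, and the concatenation-based fusion are all assumptions introduced for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def cross_modal_attention(spectral, elevation, w_q, w_k, w_v):
    """Hypothetical fusion step: spectral features attend to elevation features.

    spectral:  (n, d_s) per-pixel spectral features
    elevation: (n, d_e) per-pixel elevation features (e.g. derived from LiDAR)
    w_q, w_k, w_v: projection matrices into a shared d-dimensional space
    """
    q = spectral @ w_q                        # queries from the spectral stream, (n, d)
    k = elevation @ w_k                       # keys from the elevation stream, (n, d)
    v = elevation @ w_v                       # values from the elevation stream, (n, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])   # scaled dot-product scores, (n, n)
    weights = softmax(scores, axis=-1)        # each row sums to 1
    attended = weights @ v                    # elevation context per pixel, (n, d)
    # Assumed fusion rule: concatenate projected spectral and attended elevation.
    fused = np.concatenate([q, attended], axis=-1)  # (n, 2d)
    return fused, weights


# Toy inputs with illustrative sizes (not taken from the paper).
rng = np.random.default_rng(0)
n, d_s, d_e, d = 16, 64, 4, 8
spectral = rng.normal(size=(n, d_s))
elevation = rng.normal(size=(n, d_e))
w_q = rng.normal(size=(d_s, d)) / np.sqrt(d_s)
w_k = rng.normal(size=(d_e, d)) / np.sqrt(d_e)
w_v = rng.normal(size=(d_e, d)) / np.sqrt(d_e)

fused, weights = cross_modal_attention(spectral, elevation, w_q, w_k, w_v)
print(fused.shape, weights.shape)
```

In this sketch each pixel's fused vector combines its own projected spectrum with an attention-weighted summary of elevation features from the other pixels in the patch, which is one way elevation could guide the spectral representation. A trained system would learn the projection matrices rather than sample them at random.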
Copyright (c) 2026 Computer Science and Engineering Transactions

This article is published under the Creative Commons Attribution 4.0 International License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.



