U-Net Ensemble for Enhanced Semantic Segmentation in Remote Sensing Imagery
Journal
Remote Sensing
Date Issued
2024-06-08
Author(s)
Abstract
Semantic segmentation of remote sensing imagery stands as a fundamental task within the
domains of both remote sensing and computer vision. Its objective is to generate a comprehensive
pixel-wise segmentation map of an image, assigning a specific label to each pixel. This facilitates
in-depth analysis and comprehension of the Earth’s surface. In this paper, we propose an approach
for enhancing semantic segmentation performance by employing an ensemble of U-Net models with
three different backbone networks: Multi-Axis Vision Transformer, ConvFormer, and EfficientNet.
The final segmentation maps are generated through a geometric mean ensemble method, leveraging
the diverse representations learned by each backbone network. The effectiveness of the base U-Net
models and the proposed ensemble is evaluated on multiple datasets commonly used for semantic
segmentation tasks in remote sensing imagery, including LandCover.ai, LoveDA, INRIA, UAVid, and
ISPRS Potsdam datasets. Our experimental results demonstrate that the proposed approach achieves
state-of-the-art performance, showcasing its effectiveness and robustness in accurately capturing the
semantic information embedded within remote sensing images.
domains of both remote sensing and computer vision. Its objective is to generate a comprehensive
pixel-wise segmentation map of an image, assigning a specific label to each pixel. This facilitates
in-depth analysis and comprehension of the Earth’s surface. In this paper, we propose an approach
for enhancing semantic segmentation performance by employing an ensemble of U-Net models with
three different backbone networks: Multi-Axis Vision Transformer, ConvFormer, and EfficientNet.
The final segmentation maps are generated through a geometric mean ensemble method, leveraging
the diverse representations learned by each backbone network. The effectiveness of the base U-Net
models and the proposed ensemble is evaluated on multiple datasets commonly used for semantic
segmentation tasks in remote sensing imagery, including LandCover.ai, LoveDA, INRIA, UAVid, and
ISPRS Potsdam datasets. Our experimental results demonstrate that the proposed approach achieves
state-of-the-art performance, showcasing its effectiveness and robustness in accurately capturing the
semantic information embedded within remote sensing images.
Subjects
File(s)![Thumbnail Image]()
Loading...
Name
remotesensing-16-02077.pdf
Size
3.25 MB
Format
Adobe PDF
Checksum
(MD5):fffb4511a32cf1b471b52e678bc9d32a
