Luxembourg Land Cover
Data-Science Internship at WEO
What you see below is a five-category 2018 land cover map of Luxembourg at 10 m resolution. Each color represents a category, and the model I developed uses this map as ground truth.

U-Net for Semantic Segmentation
To train the model, I created a parametric U-Net model that takes satellite images as input. For each pixel it outputs the probabilities of belonging to each class.
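To make the per-pixel output concrete, here is a minimal NumPy sketch of how raw class scores for one tile could be turned into probabilities and a predicted label map. It is not the project's U-Net code: the function name, the random scores, and the 128 by 128 tile shape are assumptions for illustration only.

```python
import numpy as np

def pixelwise_softmax(logits: np.ndarray) -> np.ndarray:
    """Convert (H, W, C) class scores into per-pixel class probabilities."""
    # Subtract the per-pixel maximum for numerical stability.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

# Hypothetical raw scores for one 128x128 tile and 5 classes.
rng = np.random.default_rng(0)
logits = rng.normal(size=(128, 128, 5))

probs = pixelwise_softmax(logits)      # shape (128, 128, 5), sums to 1 per pixel
labels = probs.argmax(axis=-1) + 1     # most likely class per pixel, numbered 1..5
```

In a segmentation network this softmax is typically the final layer, so the probabilities rather than the hard labels remain available for downstream use.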

Implementation on AWS
After designing the architecture locally, I deployed the model on AWS with the help of my colleagues, which let us run training on SageMaker. To prepare the inputs, I also wrote Python scripts on Cloud9 that download and process data from S3. As a result, everything up to model evaluation can be carried out in the cloud.
Quantitative results
Because the five categories are unevenly distributed in the ground-truth map, the average F1 score is the most meaningful metric for comparing models. I obtained the best results by training the model on both 2018 and 2019 satellite images while keeping the same 2018 ground-truth map. Although this approach may appear unconventional, it rests on the assumption that the landscape changes very little between two consecutive years. Some pixels in the 2019 images are inevitably mislabeled, but the approach nearly doubles the input data, and the vast majority of it is labeled correctly. For instance, buildings, streets, and rivers typically change very little over a one-year period.
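The two-year training setup described above can be sketched as follows. This is an illustrative reconstruction, not the original pipeline: the function name, the number of tiles, and the four spectral bands are assumptions.

```python
import numpy as np

def build_training_set(images_2018, images_2019, labels_2018):
    """Stack both years of imagery and reuse the 2018 labels for each year."""
    if images_2018.shape != images_2019.shape:
        raise ValueError("both years must cover the same tiles")
    x = np.concatenate([images_2018, images_2019], axis=0)
    # The 2019 tiles are paired with the 2018 ground truth (assumed unchanged).
    y = np.concatenate([labels_2018, labels_2018], axis=0)
    return x, y

# Hypothetical data: 10 tiles of 128x128 pixels with 4 bands (band count is an assumption).
rng = np.random.default_rng(0)
img18 = rng.random((10, 128, 128, 4), dtype=np.float32)
img19 = rng.random((10, 128, 128, 4), dtype=np.float32)
lab18 = rng.integers(0, 5, size=(10, 128, 128))

x_train, y_train = build_training_set(img18, img19, lab18)
```

The training set doubles in size while the label tiles are simply repeated, which is why any real change in the landscape between the two years shows up as label noise.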
Confusion matrix & Scores - Average F1 Score: 80.3% (previous model's avg F1 score: 72%)
| | Class 1 / pred | Class 2 / pred | Class 3 / pred | Class 4 / pred | Class 5 / pred |
|---|---|---|---|---|---|
| Class 1 / true | 185874 | 13112 | 176 | 23505 | 26333 |
| Class 2 / true | 2630 | 664735 | 13 | 9307 | 114957 |
| Class 3 / true | 406 | 21 | 7665 | 4639 | 1001 |
| Class 4 / true | 26570 | 12155 | 789 | 1448790 | 96448 |
| Class 5 / true | 32126 | 65638 | 413 | 86806 | 944211 |
Class 1: Precision: 75.07% - Recall: 74.65% - F1 score: 74.86% → Buildings / Other construction area
Class 2: Precision: 87.97% - Recall: 83.97% - F1 score: 85.92% → Bare soil / Seasonal herbaceous vegetation / Vineyards
Class 3: Precision: 84.64% - Recall: 55.82% - F1 score: 67.27% → Water
Class 4: Precision: 92.10% - Recall: 91.42% - F1 score: 91.76% → Trees / Bushes
Class 5: Precision: 79.82% - Recall: 83.62% - F1 score: 81.67% → Permanent herbaceous vegetation
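As a cross-check, the per-class scores can be recomputed from the confusion matrix above, reading rows as true classes and columns as predicted classes. The following NumPy sketch does that; the variable names are illustrative and not taken from the project code.

```python
import numpy as np

# Confusion matrix from the table: rows = true class, columns = predicted class.
cm = np.array([
    [185874,  13112,  176,   23505,  26333],
    [  2630, 664735,   13,    9307, 114957],
    [   406,     21, 7665,    4639,   1001],
    [ 26570,  12155,  789, 1448790,  96448],
    [ 32126,  65638,  413,   86806, 944211],
])

tp = np.diag(cm)                    # correctly classified pixels per class
precision = tp / cm.sum(axis=0)     # column sums = pixels predicted as each class
recall = tp / cm.sum(axis=1)        # row sums = pixels truly in each class
f1 = 2 * precision * recall / (precision + recall)

macro_f1 = f1.mean()                # unweighted average over the five classes
accuracy = tp.sum() / cm.sum()      # share of all pixels classified correctly

for k in range(5):
    print(f"Class {k + 1}: P={precision[k]:.2%} R={recall[k]:.2%} F1={f1[k]:.2%}")
print(f"Average F1: {macro_f1:.1%}, accuracy: {accuracy:.2%}")
```

Run as written, this gives an average F1 of about 80.3% and an overall accuracy of about 86.28%, in line with the figures reported here. The gap between the two numbers comes mostly from Class 3 (Water), which has few pixels and a low recall: it barely affects accuracy but counts fully in the unweighted F1 average.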
Loss function

As the loss curve for the validation dataset shows, the model exhibits very little overfitting.
Accuracy function (86.28%)

Qualitative results
Below you can see a qualitative evaluation of the model. On the left there is the satellite image (a 128px by 128px tile), in the center the ground-truth (taken from the land cover above), and on the right the model prediction.
Satellite - 10m resolution

Ground-Truth

Prediction


• Buildings
• Other constr. area
• Bare soil
• Seasonal vegetation
• Vineyards
• Water
• Trees
• Bushes
• Permanent vegetation
Business implications for WEO
Thanks to the model I created, my former colleagues at WEO can use the output probabilities as input to calculate canopy growth for specific trees of interest. This is accomplished by extrapolating the likelihood of each pixel belonging to the Trees / Bushes category. The results are displayed in this section.
Initial tree canopy
Growth of the tree canopy
Images like the ones shown here are among the outputs featured in the environmental reports produced by WEO. These analyses offer valuable insights that local governments can use for various objectives, including more effective budget allocation, enhanced environmental surveillance, and proactive risk management.
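The canopy calculation itself is not detailed here, so the following is only one possible reading of it: compare the Trees / Bushes probability of each pixel between two predictions of the same area. The function name, the 0.5 threshold, the class index, and the random inputs are all assumptions for illustration, not WEO's actual method.

```python
import numpy as np

TREE_CLASS_INDEX = 3  # assumption: Class 4 (Trees / Bushes) sits at zero-based index 3

def canopy_change(probs_before, probs_after, threshold=0.5):
    """Compare Trees / Bushes probabilities of two predictions of the same area."""
    tree_before = probs_before[..., TREE_CLASS_INDEX]
    tree_after = probs_after[..., TREE_CLASS_INDEX]
    delta = tree_after - tree_before   # per-pixel change in tree likelihood
    # Pixels that cross the (hypothetical) threshold count as newly covered.
    gained = (tree_before < threshold) & (tree_after >= threshold)
    pixel_area_m2 = 10 * 10            # 10 m resolution gives 100 m² per pixel
    return delta, int(gained.sum()) * pixel_area_m2

# Hypothetical probability maps for one 128x128 tile at two dates.
rng = np.random.default_rng(0)
before = rng.dirichlet(np.ones(5), size=(128, 128))
after = rng.dirichlet(np.ones(5), size=(128, 128))

delta, gained_area = canopy_change(before, after)
```

Working with probabilities rather than hard labels keeps gradual changes visible: a pixel whose tree likelihood rises from 0.2 to 0.45 still shows up in `delta` even though its predicted class may not change.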

Skills acquired/improved and tools used in the project
Although I spent only three months at WEO working on this project, I developed a strong understanding of the basics of deep learning and remote sensing. Working with multiple AWS services also added considerable value to the overall experience. In summary, I used the following tools:


TensorFlow

NumPy

GDAL


SageMaker

Cloud9

S3

Airflow


QGIS

European Space Agency

Geoportal.lu



Enrico's contributions during his internship exceeded our expectations. Instead of mere optimization, he undertook the task of rewriting the model's codebase. Notably, this model outperformed its predecessor, showcasing Enrico's problem-solving skills. In addition to his technical skills, Enrico is a great team player. He is a fast learner and a hardworking individual who adds value to any team he joins.


Frankwin van Winsen
Head of Development - WEO