Luxembourg Land Cover

Data-Science Internship at WEO

Below is a five-category 2018 land cover map of Luxembourg at 10 m resolution. Each color represents a category, and the model I developed uses this map as its ground truth.

U-Net for Semantic Segmentation

I built a parametric U-Net that takes satellite images as input and, for each pixel, outputs the probability of belonging to each class.
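The per-pixel output can be illustrated with a small NumPy sketch. It is not the U-Net itself: it assumes the network has already produced a raw score for each of the five classes at every pixel of a tile, and shows how a softmax over the class axis turns those scores into probabilities. The function name and the random scores are illustrative assumptions.

```python
import numpy as np

def pixelwise_softmax(logits: np.ndarray) -> np.ndarray:
    """Convert per-pixel class scores of shape (H, W, C) into probabilities."""
    # Subtract the per-pixel maximum for numerical stability.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

# Hypothetical raw scores for one 128x128 tile and 5 land cover classes.
rng = np.random.default_rng(0)
logits = rng.normal(size=(128, 128, 5))

probs = pixelwise_softmax(logits)     # shape (128, 128, 5); each pixel sums to 1
labels = probs.argmax(axis=-1) + 1    # most likely class per pixel, numbered 1..5
```

Taking the argmax gives a hard class map, while keeping `probs` preserves the per-class likelihoods for later use.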

Implementation on AWS

After designing the architecture locally, I deployed the model on AWS with the help of my colleagues, which let us run training on SageMaker. For input preparation, I also wrote Python scripts on Cloud9 to download and process data from S3. As a result, everything up to model evaluation can be carried out in the cloud.

Quantitative results

Because the five categories are unevenly represented in the ground-truth map, the most meaningful metric for comparing models is the average F1 score. I obtained the best results by training on both 2018 and 2019 satellite images while keeping the same 2018 ground-truth map. This may appear unconventional, but it rests on the assumption that the landscape changes very little between two consecutive years: buildings, streets, and rivers, for instance, rarely change within a year. Some pixels in the 2019 images are inevitably paired with outdated labels, but the approach nearly doubles the input data, and the vast majority of it remains correctly labelled.
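A minimal sketch of this data-doubling idea, assuming the tiles are held as NumPy arrays: the 2018 and 2019 image tiles are stacked, and the 2018 ground-truth tiles are repeated so that each 2019 tile is paired with the label of the same location. The array names, shapes, and band count are placeholders, not the project's actual data layout.

```python
import numpy as np

def stack_two_years(images_2018, images_2019, labels_2018):
    """Pair both years of imagery with the single 2018 label map."""
    if images_2018.shape != images_2019.shape:
        raise ValueError("both years must cover the same tiles")
    images = np.concatenate([images_2018, images_2019], axis=0)
    # The 2019 tiles reuse the 2018 labels (assumes little change in one year).
    labels = np.concatenate([labels_2018, labels_2018], axis=0)
    return images, labels

# Placeholder data: 10 tiles of 128x128 pixels with 4 spectral bands (assumed).
rng = np.random.default_rng(0)
images_2018 = rng.random((10, 128, 128, 4))
images_2019 = rng.random((10, 128, 128, 4))
labels_2018 = rng.integers(1, 6, size=(10, 128, 128))  # classes 1..5

images, labels = stack_two_years(images_2018, images_2019, labels_2018)
```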

Confusion matrix & Scores - Average F1 Score: 80.3% (previous model's avg F1 score: 72%)

	Class 1 / pred	Class 2 / pred	Class 3 / pred	Class 4 / pred	Class 5 / pred
Class 1 / true	185874	13112	176	23505	26333
Class 2 / true	2630	664735	13	9307	114957
Class 3 / true	406	21	7665	4639	1001
Class 4 / true	26570	12155	789	1448790	96448
Class 5 / true	32126	65638	413	86806	944211


Class 1: Precision: 75.07% - Recall: 74.65% - F1 score: 74.86% → Buildings / Other construction area

Class 2: Precision: 87.97% - Recall: 83.97% - F1 score: 85.92% → Bare soil / Seasonal herbaceous vegetation / Vineyards

Class 3: Precision: 84.64% - Recall: 55.82% - F1 score: 67.27% → Water

Class 4: Precision: 92.10% - Recall: 91.42% - F1 score: 91.76% → Trees / Bushes

Class 5: Precision: 79.82% - Recall: 83.62% - F1 score: 81.67% → Permanent herbaceous vegetation
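The per-class scores follow directly from the confusion matrix. As a check, this NumPy sketch recomputes them, reading rows as true classes and columns as predicted classes, as in the table. The variable names are my own choice for illustration.

```python
import numpy as np

# Confusion matrix from the table: rows = true class, columns = predicted class.
cm = np.array([
    [185874,  13112,  176,   23505,  26333],
    [  2630, 664735,   13,    9307, 114957],
    [   406,     21, 7665,    4639,   1001],
    [ 26570,  12155,  789, 1448790,  96448],
    [ 32126,  65638,  413,   86806, 944211],
])

tp = np.diag(cm)                   # correctly classified pixels per class
precision = tp / cm.sum(axis=0)    # correct / all pixels predicted as the class
recall = tp / cm.sum(axis=1)       # correct / all pixels truly in the class
f1 = 2 * precision * recall / (precision + recall)

macro_f1 = f1.mean()               # unweighted average over the five classes
accuracy = tp.sum() / cm.sum()     # share of all pixels classified correctly

for k in range(5):
    print(f"Class {k + 1}: P={precision[k]:.2%} R={recall[k]:.2%} F1={f1[k]:.2%}")
print(f"Average F1: {macro_f1:.1%}  Accuracy: {accuracy:.2%}")
```

Because the average is unweighted, the small Water class counts as much as the much larger Trees / Bushes class, which is why its low recall pulls the average F1 below the overall accuracy.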

Loss function

As the loss curve for the validation dataset shows, the model overfits very little.

Accuracy function (86.28%)

Qualitative results

Below you can see a qualitative evaluation of the model. On the left there is the satellite image (a 128px by 128px tile), in the center the ground-truth (taken from the land cover above), and on the right the model prediction.

Satellite - 10m resolution

Ground-Truth

Prediction

• Buildings

• Other constr. area

• Bare soil

• Seasonal vegetation

• Vineyards

• Water

• Trees

• Bushes

• Permanent vegetation

Business implications for WEO

Thanks to the model I created, my former colleagues at WEO can use the output probabilities to calculate canopy growth for specific trees of interest. This is done by extracting, for each pixel, the likelihood that it belongs to the Trees / Bushes category. The results are displayed in this section.
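How such a canopy-growth figure could be derived is sketched below. This is an assumption about the approach, not WEO's actual method: it takes the Trees / Bushes probability channel from predictions at two dates, thresholds it to get a canopy mask, and converts the change in pixel count to an area using the 10 m resolution (100 m² per pixel). The 0.5 threshold, the function names, and the toy values are all hypothetical.

```python
import numpy as np

PIXEL_AREA_M2 = 10 * 10  # one pixel covers 10 m x 10 m

def canopy_mask(tree_prob: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Mark pixels whose Trees / Bushes probability reaches the threshold."""
    return tree_prob >= threshold

def canopy_growth_m2(prob_before, prob_after, threshold=0.5) -> int:
    """Change in canopy area between two probability maps, in square metres."""
    before = int(canopy_mask(prob_before, threshold).sum())
    after = int(canopy_mask(prob_after, threshold).sum())
    return (after - before) * PIXEL_AREA_M2

# Toy 3x3 probability maps for the same area at two dates (made-up values).
prob_before = np.array([[0.9, 0.2, 0.1],
                        [0.8, 0.4, 0.1],
                        [0.3, 0.2, 0.1]])
prob_after = np.array([[0.9, 0.7, 0.1],
                       [0.8, 0.6, 0.1],
                       [0.3, 0.2, 0.1]])

growth = canopy_growth_m2(prob_before, prob_after)  # 2 new canopy pixels
```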

Initial tree canopy

Growth of the tree canopy

Images like the ones shown here are among the outputs featured in the environmental reports produced by WEO. These analyses offer insights that local governments can use for various objectives, including more effective budget allocation, enhanced environmental surveillance, and proactive risk management.

Skills acquired/improved and tools used in the project

Although I spent only three months at WEO working on this project, I developed a strong understanding of the basics of deep learning and remote sensing. Working with multiple AWS services added further value to the experience. In summary, I used the following tools:

TensorFlow

NumPy

GDAL

SageMaker

Cloud9

Airflow

QGIS

European Space Agency

Geoportal.lu

Enrico's contributions during his internship exceeded our expectations. Instead of mere optimization, he undertook the task of rewriting the model's codebase. Notably, this model outperformed its predecessor, showcasing Enrico's problem-solving skills. In addition to his technical skills, Enrico is a great team player. He is a fast learner and a hardworking individual who adds value to any team he joins.

Frankwin van Winsen

Head of Development - WEO

Frankwin's recommendation letter