This repository contains multiple projects developed for the Environmental Data Analytics course in the Master of Science in Data Science program at the University of Luxembourg.
The main focus of this repository is a LiDAR-based tree detection study in dense mixed forests, which serves as the final and most comprehensive project of the course. Earlier projects are also included as supporting work and are briefly described at the end of this README.
The primary goal of this project is to detect individual trees from UAV LiDAR point cloud data in dense mixed forests and to evaluate the performance of classical and deep learning–based approaches.
Two methods are studied and compared:
- Local Maxima Filter (LMF) for tree detection
- RandLA-Net, a deep learning model for point cloud segmentation
The project emphasizes practical performance, robustness, and the limitations of applying deep learning models to real-world environmental data.
The dataset consists of UAV LiDAR point clouds and field inventory data collected in mixed forest plots in Perm Krai, Russia. It includes:
- Height-normalized LiDAR point clouds
- Ground-truth tree locations and species labels
- RGB orthophotos aligned with the LiDAR data
The dataset itself is not included in this repository.
Preprocessing
- Height normalization
- Noise filtering
- Cropping to remove boundary effects
- Feature scaling and label assignment
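The steps above can be sketched roughly as follows. This is a minimal NumPy illustration, not the repository's code: the function names, the per-cell minimum used as a crude ground estimate, and all default thresholds are assumptions made for the example. Label assignment is not covered here.

```python
import numpy as np

def normalize_height(points, cell=1.0):
    """Subtract a crude ground estimate (minimum z per grid cell) from z.
    Assumption: real pipelines usually use a proper ground model instead."""
    ij = np.floor(points[:, :2] / cell).astype(np.int64)
    _, inv = np.unique(ij, axis=0, return_inverse=True)
    inv = inv.ravel()
    ground = np.full(inv.max() + 1, np.inf)
    np.minimum.at(ground, inv, points[:, 2])  # per-cell minimum elevation
    out = points.copy()
    out[:, 2] -= ground[inv]
    return out

def filter_noise(points, z_min=0.0, z_max=50.0):
    """Drop points whose normalized height is outside a plausible range."""
    keep = (points[:, 2] >= z_min) & (points[:, 2] <= z_max)
    return points[keep]

def crop_interior(points, buffer=2.0):
    """Keep points at least `buffer` units inside the xy bounding box."""
    lo = points[:, :2].min(axis=0) + buffer
    hi = points[:, :2].max(axis=0) - buffer
    keep = np.all((points[:, :2] >= lo) & (points[:, :2] <= hi), axis=1)
    return points[keep]

def standardize(features):
    """Scale each feature column to zero mean and unit variance."""
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant columns
    return (features - mean) / std
```

A typical call order would be `normalize_height`, then `filter_noise`, then `crop_interior`, then `standardize`, since the height thresholds only make sense after normalization.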
Local Maxima Filter
- KD-tree–based nearest neighbor search
- Detection of local height maxima as tree candidates
- Matching with ground-truth trees using spatial constraints
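As an illustration of the LMF steps listed above, here is a small sketch using SciPy's `cKDTree`. It assumes a height-normalized `(N, 3)` array of x, y, z coordinates; the function names, the search radius, the minimum height, and the greedy one-to-one matching rule are assumptions for this example and may differ from what the project actually uses.

```python
import numpy as np
from scipy.spatial import cKDTree

def local_maxima(points, radius=2.0, min_height=2.0):
    """Return points that are the highest within `radius` in the xy plane.
    Points tied for the maximum are all kept."""
    xy, z = points[:, :2], points[:, 2]
    tree = cKDTree(xy)
    tops = []
    for i in np.flatnonzero(z >= min_height):
        neighbors = tree.query_ball_point(xy[i], r=radius)  # includes i itself
        if z[i] >= z[neighbors].max():
            tops.append(i)
    return points[tops]

def match_detections(detected_xy, truth_xy, max_dist=2.0):
    """Greedy one-to-one matching: each ground-truth tree claims its nearest
    detection if it lies within `max_dist` and is not already taken."""
    if len(detected_xy) == 0:
        return {"tp": 0, "fp": 0, "fn": len(truth_xy)}
    tree = cKDTree(detected_xy)
    dist, idx = tree.query(truth_xy, k=1, distance_upper_bound=max_dist)
    used = set()
    for t in np.argsort(dist):  # closest pairs are matched first
        if np.isfinite(dist[t]) and int(idx[t]) not in used:
            used.add(int(idx[t]))
    tp = len(used)
    return {"tp": tp, "fp": len(detected_xy) - tp, "fn": len(truth_xy) - tp}
```

Precision and recall follow from the returned counts as `tp / (tp + fp)` and `tp / (tp + fn)`. A smaller `radius` tends to produce more candidates, and with them more false positives.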
RandLA-Net
- Point-wise classification using a PyTorch implementation
- Custom DataLoader for large point clouds
- Random sampling during training and full-resolution inference during testing
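The difference between training-time sampling and test-time inference can be sketched as below. This is a framework-agnostic NumPy illustration rather than the project's DataLoader: the function names, the block size of 4096 points, and the `predict_fn` callback are hypothetical. In a PyTorch setup, the sampling function would typically be called from the `__getitem__` method of a `torch.utils.data.Dataset`.

```python
import numpy as np

def sample_training_block(points, labels, num_points=4096, rng=None):
    """Draw a fixed-size random subset of a cloud for one training sample.
    Samples with replacement only if the cloud has fewer than `num_points`."""
    rng = np.random.default_rng() if rng is None else rng
    n = len(points)
    idx = rng.choice(n, size=num_points, replace=n < num_points)
    return points[idx], labels[idx]

def predict_full_resolution(points, predict_fn, chunk_size=4096):
    """Run `predict_fn` over contiguous chunks so that every point
    receives exactly one prediction."""
    preds = np.empty(len(points), dtype=np.int64)
    for start in range(0, len(points), chunk_size):
        chunk = points[start:start + chunk_size]
        preds[start:start + len(chunk)] = predict_fn(chunk)
    return preds
```

Fixed-size samples keep training batches uniform in shape, while chunked inference bounds memory use without discarding any points.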
Despite extensive tuning, RandLA-Net does not achieve competitive performance on this dataset. Potential causes and limitations are discussed in detail in the accompanying paper.
- LMF provides stable and interpretable results but generates false positives
- RandLA-Net fails to generalize effectively on this dataset
- Classical methods outperform deep learning for this specific task and data regime
Detailed metrics, figures, and qualitative comparisons are available in project_3/paper.pdf.
This project analyzes long-term climate trends in southern Luxembourg using the ERA5-Land reanalysis dataset. It applies time series analysis, correlation studies, anomaly detection, and linear regression to temperature and precipitation data.
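A minimal sketch of the trend and anomaly steps, assuming a series of annual mean values has already been extracted from the reanalysis data. The function names, the z-score threshold, and the synthetic numbers in the usage example are illustrative assumptions, not results from the project.

```python
import numpy as np

def linear_trend(years, values):
    """Least-squares linear fit; the slope is in value units per year."""
    slope, intercept = np.polyfit(years, values, deg=1)
    return slope, intercept

def flag_anomalies(values, baseline_mask, threshold=2.0):
    """Standardize against a baseline period and flag values whose
    absolute z-score exceeds `threshold`."""
    base = values[baseline_mask]
    z = (values - base.mean()) / base.std(ddof=1)
    return z, np.abs(z) > threshold

# Usage with synthetic data (hypothetical numbers, not ERA5-Land values).
years = np.arange(1981, 2021)
rng = np.random.default_rng(42)
temps = 9.0 + 0.03 * (years - years[0]) + rng.normal(0.0, 0.3, years.size)
slope, intercept = linear_trend(years, temps)
z, is_anomaly = flag_anomalies(temps, baseline_mask=years <= 2010)
print(f"trend: {slope * 10:.2f} degrees per decade, anomalies: {int(is_anomaly.sum())}")
```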
This project explores bird occurrence data in Kenya using geospatial clustering and interpolation techniques. DBSCAN is used to identify clusters of bird sightings, and a species richness heatmap is generated to highlight biodiversity hotspots.
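One simple way to build a species richness surface is to count the distinct species observed in each grid cell. The sketch below shows only that counting step on a regular longitude/latitude grid; the function name and the 0.5 degree cell size are assumptions, and it is a stand-in for, not a reproduction of, the clustering and interpolation used in the project.

```python
import numpy as np

def richness_grid(lon, lat, species, cell_deg=0.5):
    """Count distinct species per grid cell.
    Row 0 is the southernmost band, column 0 the westernmost."""
    lon = np.asarray(lon, dtype=float)
    lat = np.asarray(lat, dtype=float)
    col = np.floor((lon - lon.min()) / cell_deg).astype(int)
    row = np.floor((lat - lat.min()) / cell_deg).astype(int)
    grid = np.zeros((row.max() + 1, col.max() + 1), dtype=int)
    # Deduplicate (cell, species) pairs so repeat sightings count once.
    for r, c, _ in set(zip(row.tolist(), col.tolist(), list(species))):
        grid[r, c] += 1
    return grid
```

The resulting array can be passed directly to a plotting function such as Matplotlib's `imshow` with `origin="lower"` so that north points up.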
- Course: Environmental Data Analytics
- Program: Master of Science in Data Science
- Institution: University of Luxembourg
All projects were developed as part of graduate coursework and are shared for educational and reference purposes.
- Anton Zaitsev
- Othmane Mahfoud
- Dylan Da Silva Moreira



