FanZhang0830/chl_prj

Introduction

This project develops a machine learning model to estimate chlorophyll-a concentration using Sentinel-2 satellite imagery combined with in-situ measurement data. The workflow includes data preprocessing, model training and validation, and spatial prediction across multiple time periods.

Data Cleaning

Data cleaning is the essential first step of this project. Because of data-sharing restrictions and an unclear disclaimer covering the raw data, only the final cleaned dataset is included in the repository, under the data/ folder. Once data-sharing permissions are clarified, the detailed cleaning procedures and scripts will also be made publicly available. The Jupyter notebook data_preview.ipynb in the notebooks folder is used to select which data points can be used to train the model. It does not affect the final result; it only previews the data points and the map.

Model Development

A Random Forest (RF) model was developed using in-situ measurements collected on 2023-07-06 and the corresponding Sentinel-2 imagery acquired on the same date. To make the model training process clearer, the in-situ measurements (ch_train_test_0706.csv) and the Sentinel-2 image (raw_masked_image_0706.tif) used to train and test the model are stored in the data folder. All source code related to model training and validation is available in the src/ directory.
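The train/test and validation step can be illustrated with a minimal sketch. The repository text does not state the split ratio or the validation metrics, so the 80/20 split, the fixed seed, and the choice of RMSE and R² below are assumptions. The function names are hypothetical, and the code uses only the Python standard library rather than the project's actual code in src/.

```python
import math
import random


def train_test_split_indices(n_samples, test_fraction=0.2, seed=42):
    """Shuffle sample indices and split them into (train, test) lists.

    The 0.2 test fraction and the seed are illustrative assumptions.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = max(1, round(n_samples * test_fraction))
    return indices[n_test:], indices[:n_test]


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

In this sketch, the held-out indices would select the in-situ chlorophyll-a values that the trained model never saw, and the two metrics would compare those values with the model's predictions.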

Model Application

After training and testing, the Random Forest model was applied to Sentinel-2 images acquired from July to October 2023 to estimate spatial and temporal variations in chlorophyll-a concentration. To avoid downloading large datasets locally, the geemap Python API is used. The full procedure for applying the model is shown in the Jupyter notebook S2_image_apply_model.ipynb in the notebooks folder. The helper module used for model application is class_prediction.py in the src folder.
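Conceptually, applying the trained model to an image means predicting one chlorophyll-a value per pixel from that pixel's band values and skipping masked pixels. The sketch below illustrates the idea only: it is not the code in class_prediction.py and does not use geemap. The function name predict_image, the nested-list image layout, and the use of NaN to mark masked pixels are all assumptions, and predict_fn stands in for the trained Random Forest.

```python
import math


def predict_image(bands, predict_fn):
    """Apply a per-pixel model to a multi-band image.

    bands: list of 2-D lists (one per band), all with the same shape.
    predict_fn: callable taking a list of band values for one pixel and
        returning a single predicted value (stand-in for the RF model).
    Masked pixels (NaN in any band, an assumed convention) stay NaN.
    """
    n_rows = len(bands[0])
    n_cols = len(bands[0][0])
    output = [[math.nan] * n_cols for _ in range(n_rows)]
    for row in range(n_rows):
        for col in range(n_cols):
            features = [band[row][col] for band in bands]
            if any(math.isnan(value) for value in features):
                continue  # leave masked pixels as NaN
            output[row][col] = predict_fn(features)
    return output


# Toy usage: two 2x2 bands and a dummy model that sums the band values.
toy_bands = [
    [[1.0, math.nan], [2.0, 3.0]],
    [[10.0, 5.0], [20.0, 30.0]],
]
toy_prediction = predict_image(toy_bands, lambda features: sum(features))
```

Repeating this for each image date between July and October would give the time series of predicted maps described above.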

Results

The predicted chlorophyll-a concentrations can be visualized using the Jupyter notebook provided in the notebooks/ folder. The module used for this is class_prediction.py in the src folder. These visualizations show the spatial patterns and temporal dynamics of chlorophyll-a across the study area.

Data Disclaimer

The data and results presented in this repository are provided for research and educational purposes only.

About

ML and DL models for chlorophyll-a estimation
