Assignment 5#
Due: Wednesday Oct 29th at 11:59 pm ET
Goal#
Develop a pipeline to calculate time series of NDWI values from Sentinel-2 data using Dask.
Instructions#
You should submit this assignment to your existing geog213-assignments or geog313-assignments GitHub repository under a new directory named assignment-5/.
Include:
- Scripts that contain the functions you develop in this assignment.
- An executed Jupyter notebook (`.ipynb`) with all the outputs.
- A `Dockerfile` that builds a reproducible environment.
- A `README.md` with instructions to reproduce the results of your work.
Tasks#
1) Develop Function(s) to Plot NDWI Time Series (60 pts)#
You should write a Python function (let's call it `main`) that receives the following inputs:

- A bbox in the form of a tuple
- A start date for searching scenes from a STAC API
- An end date for searching scenes from a STAC API

and plots the mean NDWI values vs. time for all Sentinel-2 scenes returned from the search.
You should break your pipeline into multiple functions inside the `main` function, as follows:

- A function to search and retrieve items from a STAC API. This function should return the collection of items from the search.
- A function that receives the following as input:
  - a STAC item collection
  - a list of assets requested by the user
  - a bbox for clipping the scenes

  and returns an xarray object using `stackstac`, with the requested assets stacked and clipped to the bbox.

  **Note:** Use the `assets` argument in `stackstac.stack` to stack only specific assets from the item collection (check the documentation here). This is good practice to reduce the size of your Dask array.
- A function that receives an xarray object with the bands needed for NDWI included, and returns the mean NDWI for each scene.
- A function that receives the mean NDWI and timestamps for each scene, and plots NDWI vs. time as a point plot.
All of the steps above should be carried out using Dask lazy computation until the plot computation is executed.
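The steps above could be sketched as follows. This is a minimal illustration, not a required structure: the STAC endpoint URL, the asset names (`"green"`, `"nir"`), and all function and parameter names are assumptions, and the optional dependencies (`pystac-client`, `stackstac`, `matplotlib`) are imported inside the functions that need them.

```python
import numpy as np
import xarray as xr


def search_stac(bbox, start_date, end_date):
    """Search a STAC API for Sentinel-2 L2A scenes and return the item collection."""
    # Imported here so the other helpers can be used without this dependency.
    import pystac_client

    # Endpoint and collection name are assumptions; use the catalog from class.
    catalog = pystac_client.Client.open("https://earth-search.aws.element84.com/v1")
    search = catalog.search(
        collections=["sentinel-2-l2a"],
        bbox=bbox,
        datetime=f"{start_date}/{end_date}",
    )
    return search.item_collection()


def stack_items(items, assets, bbox):
    """Stack only the requested assets into a lazy xarray object clipped to the bbox."""
    import stackstac

    return stackstac.stack(items, assets=assets, bounds_latlon=bbox)


def mean_ndwi(stacked):
    """Per-scene mean NDWI = (green - nir) / (green + nir); stays lazy with Dask."""
    green = stacked.sel(band="green")
    nir = stacked.sel(band="nir")
    ndwi = (green - nir) / (green + nir)
    return ndwi.mean(dim=["x", "y"])


def plot_ndwi(ndwi_means, times):
    """Point plot of mean NDWI vs. time; Dask computation is triggered here."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.plot(times, np.asarray(ndwi_means), "o")
    ax.set_xlabel("Time")
    ax.set_ylabel("Mean NDWI")
    return ax


def main(bbox, start_date, end_date):
    items = search_stac(bbox, start_date, end_date)
    stacked = stack_items(items, assets=["green", "nir"], bbox=bbox)
    means = mean_ndwi(stacked)
    return plot_ndwi(means, stacked.time.values)
```

Everything up to `plot_ndwi` builds a lazy Dask graph; converting the means to a NumPy array inside the plotting function is what finally executes the computation.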
2) Plot a Sample NDWI Time Series (10 pts)#
Now that you have prepared your pipeline, create a Jupyter Notebook and do the following:

- Start a local Dask cluster.
- Import the `main` function you developed in the previous task.
- Plot mean NDWI for the following geometry between Jan 1st, 2020 and Dec 31st, 2024 using `main`:
```json
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {},
      "geometry": {
        "coordinates": [
          [
            [37.48171911932283, 12.09646539241534],
            [37.48171911932283, 11.962224220348077],
            [37.656944179348955, 11.962224220348077],
            [37.656944179348955, 12.09646539241534],
            [37.48171911932283, 12.09646539241534]
          ]
        ],
        "type": "Polygon"
      }
    }
  ]
}
```
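The notebook cells could look like the sketch below. The bbox tuple is derived from the polygon above; the cluster and `main` calls are shown as comments because they require a running Dask environment and network access, and the module name `pipeline` is a hypothetical name for your Task 1 script.

```python
# Bounding box (min_lon, min_lat, max_lon, max_lat) of the polygon above.
coords = [
    (37.48171911932283, 12.09646539241534),
    (37.48171911932283, 11.962224220348077),
    (37.656944179348955, 11.962224220348077),
    (37.656944179348955, 12.09646539241534),
]
lons = [lon for lon, _ in coords]
lats = [lat for _, lat in coords]
bbox = (min(lons), min(lats), max(lons), max(lats))

# In the notebook (requires dask.distributed and network access):
# from dask.distributed import Client, LocalCluster
# cluster = LocalCluster()
# client = Client(cluster)
# from pipeline import main  # hypothetical module name for your script
# main(bbox, "2020-01-01", "2024-12-31")
```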
Commit all the code from both tasks to GitHub at this point.
3) Filter Cloudy Pixels (30 pts)#
In this task, extend your pipeline from Task 1 by adding an input option to the `main` function that lets the user choose whether to filter clouds when calculating NDWI. The default value should be `True`. After revising the `main` function, repeat Task 2 in the same notebook, adding new cells that use the cloud filter to create a second plot with NDWI values filtered for cloudy pixels.
To filter cloudy pixels, use the Scene Classification Layer (SCL) band from Sentinel-2. Check out Table 2 here to learn about the different values in the SCL layer. You need to exclude any pixel that has an SCL value of 3, 8, 9, or 10.
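One way to apply this filter is to also stack the SCL asset and mask the reflectance bands with `xarray.where` before computing NDWI. The sketch below is a minimal illustration: the asset name `"scl"` and the function name are assumptions, and with Dask-backed arrays the masking stays lazy.

```python
import numpy as np
import xarray as xr

# SCL classes to exclude: 3 = cloud shadow, 8 = cloud medium probability,
# 9 = cloud high probability, 10 = thin cirrus.
CLOUDY_SCL_VALUES = [3, 8, 9, 10]


def mask_clouds(stacked):
    """Set pixels with cloudy SCL classes to NaN across all other bands."""
    scl = stacked.sel(band="scl")
    keep = ~scl.isin(CLOUDY_SCL_VALUES)
    # Drop the SCL band itself and apply the mask to the remaining bands.
    bands = stacked.sel(band=[b for b in stacked.band.values if b != "scl"])
    return bands.where(keep)
```

Because NaN pixels are ignored by xarray's default `mean` (it uses a NaN-skipping reduction), the rest of the NDWI pipeline can stay unchanged after the mask is applied.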
Commit your updated notebook, and other scripts to GitHub.