Assignment 6#

Due: Monday Nov 4th at 11:59 pm ET

In this assignment, you will use Dask, and stackstac to retrieve time series of NDWI values from Sentinel-2 imagery.

You should submit this assignment to your existing geog213-assignments or geog313-assignments GitHub repository under a new directory named assignment-6.

Create a Dockerfile with all required packages to complete the task, and following best practices covered in the class. Include a README in your assignment-6 directory with all the steps a user needs to follow to reproduce your results.

You should write a Python function (let’s call it main) that receives the following inputs:

  1. A bbox in the form of a tuple

  2. A start date for searching scenes from a STAC API

  3. An end date for searching scenes from a STAC API

and plots the mean NDWI values vs time for all Sentinel-2 scenes returned from the search.

You should break your pipeline to multiple functions inside the main function as following:

  1. A function to search and retrieve items from a STAC API. This function should return the collection of items from the search.

  2. A function that receives the following as input

  • a STAC item collection

  • list of assets requested by the user

  • bbox for clipping the scenes

and returns a xarray object using stackstac with the requested assets stacked and clipped to the bbox.

Note You can use the argument assets in stackstac.stack to only stack specific assets from the item collection (check documentation here). This is a good practice to reduce the size of your dask array.

  1. A function that receives an xarray object with bands needed for NDWI included, and returns the mean NDWI for each scene.

  2. A function that receives the mean NDWI and time stamps for each scene, and plots NDWI vs time as point plot.

All of the steps above should be carried out using Dask lazy computation until the plot computation is executed.

Now that you have prepared your pipline, create a Jupyter Notebook and do the following:

  1. Start a local Dask cluster.

  2. Import the main function you have developed.

  3. Plot mean NDWI for the following bbox between Jan 1st, 2017 and Dec 31st, 2023 using main:

bbox = (32.16033157094111, 22.852984440390316, 33.288140399555346, 23.806193394664234)

[Optional] Filter NDWI Time Series (Extra 3 points)#

We know that some pixels from our previous NDWI calculation will be cloudy, and have an incorrect NDWI value. To improve your NDWI plot, create a new function that masks cloudy pixels using the Scene Classification Layer (SCL) band from Sentinel-2. (checkout Table 6 here to learn about different values in the SCL layer. You need to exclude any pixel that has a SCL value of 3, 8, 9 or 10)