{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Xarray Interpolation, Groupby, Resample, Rolling, and Coarsen\n", "\n", "**Attribution**: This notebook is a revision of the [Xarray Interpolation, Groupby, Resample, Rolling, and Coarsen notebook](https://earth-env-data-science.github.io/lectures/xarray/xarray-part2.html) by Ryan Abernathey from [An Introduction to Earth and Environmental Data Science](https://earth-env-data-science.github.io/intro.html). Thanks to Aiyin Zhang for preparing this notebook. \n", "\n", "In this lesson, we cover some more advanced aspects of xarray." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "tags": [] }, "outputs": [], "source": [ "import numpy as np\n", "import xarray as xr\n", "from matplotlib import pyplot as plt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Interpolation\n", "\n", "In the previous lesson on Xarray, we learned how to select data based on its dimension coordinates and how to align data with different dimension coordinates.\n", "But what if we want to estimate the value of the data variables at _different coordinates_?\n", "This is where interpolation comes in." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "tags": [] }, "outputs": [ { "data": { "text/html": [ "
<xarray.DataArray (x: 11)>\n", "array([ 0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100])\n", "Coordinates:\n", " * x (x) int64 0 1 2 3 4 5 6 7 8 9 10
<xarray.DataArray ()>\n", "array(9)\n", "Coordinates:\n", " x int64 3
<xarray.DataArray ()>\n", "array(20.5)\n", "Coordinates:\n", " x float64 4.5
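The three outputs above are consistent with a small quadratic test array. The sketch below is an assumed reconstruction rather than the notebook's own cell: the variable names `da`, `exact`, and `between` are hypothetical. It contrasts label-based selection at an existing coordinate with linear interpolation at a coordinate that is not on the grid. Depending on the installed xarray version, `interp` may require SciPy.

```python
import numpy as np
import xarray as xr

# Assumed reconstruction: the squares of 0..10 on an integer x coordinate
x = np.arange(11)
da = xr.DataArray(x**2, dims="x", coords={"x": x})

# Selection only works at coordinates that exist in the array
exact = da.sel(x=3)

# Interpolation estimates a value between grid points;
# the default method is linear, so x=4.5 lies halfway between 16 and 25
between = da.interp(x=4.5)

print(exact.item(), between.item())  # 9 20.5
```

Calling `da.sel(x=4.5)` instead would raise a `KeyError`, because 4.5 is not one of the coordinate labels.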
<xarray.Dataset>\n", "Dimensions: (lat: 89, lon: 180, time: 756)\n", "Coordinates:\n", " * lat (lat) float32 88.0 86.0 84.0 82.0 80.0 ... -82.0 -84.0 -86.0 -88.0\n", " * lon (lon) float32 0.0 2.0 4.0 6.0 8.0 ... 350.0 352.0 354.0 356.0 358.0\n", " * time (time) datetime64[ns] 1960-01-01 1960-02-01 ... 2022-12-01\n", "Data variables:\n", " sst (time, lat, lon) float32 ...\n", "Attributes: (12/39)\n", " climatology: Climatology is based on 1971-2000 SST, X...\n", " description: In situ data: ICOADS2.5 before 2007 and ...\n", " keywords_vocabulary: NASA Global Change Master Directory (GCM...\n", " keywords: Earth Science > Oceans > Ocean Temperatu...\n", " instrument: Conventional thermometers\n", " source_comment: SSTs were observed by conventional therm...\n", " ... ...\n", " comment: SSTs were observed by conventional therm...\n", " summary: ERSST.v5 is developed based on v4 after ...\n", " dataset_title: NOAA Extended Reconstructed SST V5\n", " _NCProperties: version=2,netcdf=4.6.3,hdf5=1.10.5\n", " data_modified: 2023-11-03\n", " DODS_EXTRA.Unlimited_Dimension: time
<xarray.DataArray 'time' (time: 756)>\n", "array(['1960-01-01T00:00:00.000000000', '1960-02-01T00:00:00.000000000',\n", " '1960-03-01T00:00:00.000000000', ..., '2022-10-01T00:00:00.000000000',\n", " '2022-11-01T00:00:00.000000000', '2022-12-01T00:00:00.000000000'],\n", " dtype='datetime64[ns]')\n", "Coordinates:\n", " * time (time) datetime64[ns] 1960-01-01 1960-02-01 ... 2022-12-01\n", "Attributes:\n", " long_name: Time\n", " delta_t: 0000-01-00 00:00:00\n", " avg_period: 0000-01-00 00:00:00\n", " prev_avg_period: 0000-00-07 00:00:00\n", " standard_name: time\n", " axis: T\n", " actual_range: [19723. 81722.]\n", " _ChunkSizes: 1
<xarray.DataArray 'month' (time: 756)>\n", "array([ 1, 2, 3, ..., 10, 11, 12])\n", "Coordinates:\n", " * time (time) datetime64[ns] 1960-01-01 1960-02-01 ... 2022-12-01\n", "Attributes:\n", " long_name: Time\n", " delta_t: 0000-01-00 00:00:00\n", " avg_period: 0000-01-00 00:00:00\n", " prev_avg_period: 0000-00-07 00:00:00\n", " standard_name: time\n", " axis: T\n", " actual_range: [19723. 81722.]\n", " _ChunkSizes: 1
<xarray.DataArray 'year' (time: 756)>\n", "array([1960, 1960, 1960, ..., 2022, 2022, 2022])\n", "Coordinates:\n", " * time (time) datetime64[ns] 1960-01-01 1960-02-01 ... 2022-12-01\n", "Attributes:\n", " long_name: Time\n", " delta_t: 0000-01-00 00:00:00\n", " avg_period: 0000-01-00 00:00:00\n", " prev_avg_period: 0000-00-07 00:00:00\n", " standard_name: time\n", " axis: T\n", " actual_range: [19723. 81722.]\n", " _ChunkSizes: 1
<xarray.DataArray 'sst' (time: 63, lat: 89, lon: 180)>\n", "[1009260 values with dtype=float32]\n", "Coordinates:\n", " * lat (lat) float32 88.0 86.0 84.0 82.0 80.0 ... -82.0 -84.0 -86.0 -88.0\n", " * lon (lon) float32 0.0 2.0 4.0 6.0 8.0 ... 350.0 352.0 354.0 356.0 358.0\n", " * time (time) datetime64[ns] 1960-01-01 1961-01-01 ... 2022-01-01\n", "Attributes:\n", " long_name: Monthly Means of Sea Surface Temperature\n", " units: degC\n", " var_desc: Sea Surface Temperature\n", " level_desc: Surface\n", " statistic: Mean\n", " dataset: NOAA Extended Reconstructed SST V5\n", " parent_stat: Individual Values\n", " actual_range: [-1.8 42.32636]\n", " valid_range: [-1.8 45. ]\n", " _ChunkSizes: [ 1 89 180]
<xarray.DataArray 'sst' (month: 12, lat: 89, lon: 180)>\n", "array([[[-1.8 , -1.8 , -1.8 , ..., -1.8 ,\n", " -1.8 , -1.8 ],\n", " [-1.8 , -1.8 , -1.8 , ..., -1.8 ,\n", " -1.8 , -1.8 ],\n", " [-1.8 , -1.8 , -1.8 , ..., -1.8 ,\n", " -1.8 , -1.8 ],\n", " ...,\n", " [ nan, nan, nan, ..., nan,\n", " nan, nan],\n", " [ nan, nan, nan, ..., nan,\n", " nan, nan],\n", " [ nan, nan, nan, ..., nan,\n", " nan, nan]],\n", "\n", " [[-1.8 , -1.8 , -1.8 , ..., -1.8 ,\n", " -1.8 , -1.8 ],\n", " [-1.8 , -1.8 , -1.8 , ..., -1.8 ,\n", " -1.8 , -1.8 ],\n", " [-1.8 , -1.8 , -1.8 , ..., -1.8 ,\n", " -1.8 , -1.8 ],\n", "...\n", " [ nan, nan, nan, ..., nan,\n", " nan, nan],\n", " [ nan, nan, nan, ..., nan,\n", " nan, nan],\n", " [ nan, nan, nan, ..., nan,\n", " nan, nan]],\n", "\n", " [[-1.7995342, -1.7996206, -1.7998532, ..., -1.7998041,\n", " -1.7996737, -1.7995361],\n", " [-1.7995963, -1.799773 , -1.8 , ..., -1.8 ,\n", " -1.7998328, -1.7996292],\n", " [-1.8 , -1.8 , -1.8 , ..., -1.8 ,\n", " -1.8 , -1.8 ],\n", " ...,\n", " [ nan, nan, nan, ..., nan,\n", " nan, nan],\n", " [ nan, nan, nan, ..., nan,\n", " nan, nan],\n", " [ nan, nan, nan, ..., nan,\n", " nan, nan]]], dtype=float32)\n", "Coordinates:\n", " * lat (lat) float32 88.0 86.0 84.0 82.0 80.0 ... -82.0 -84.0 -86.0 -88.0\n", " * lon (lon) float32 0.0 2.0 4.0 6.0 8.0 ... 350.0 352.0 354.0 356.0 358.0\n", " * month (month) int64 1 2 3 4 5 6 7 8 9 10 11 12\n", "Attributes:\n", " long_name: Monthly Means of Sea Surface Temperature\n", " units: degC\n", " var_desc: Sea Surface Temperature\n", " level_desc: Surface\n", " statistic: Mean\n", " dataset: NOAA Extended Reconstructed SST V5\n", " parent_stat: Individual Values\n", " actual_range: [-1.8 42.32636]\n", " valid_range: [-1.8 45. ]\n", " _ChunkSizes: [ 1 89 180]
<xarray.DataArray 'sst' (month: 12)>\n", "array([13.67924 , 13.787482, 13.784192, 13.703959, 13.662183, 13.736521,\n", " 13.950218, 14.123172, 14.008661, 13.715476, 13.52839 , 13.548249],\n", " dtype=float32)\n", "Coordinates:\n", " * month (month) int64 1 2 3 4 5 6 7 8 9 10 11 12
<xarray.DataArray 'sst' (month: 12, lat: 89, lon: 180)>\n", "array([[[-1.800001 , -1.800001 , -1.800001 , ..., -1.800001 ,\n", " -1.800001 , -1.800001 ],\n", " [-1.800001 , -1.800001 , -1.800001 , ..., -1.800001 ,\n", " -1.800001 , -1.800001 ],\n", " [-1.800001 , -1.800001 , -1.800001 , ..., -1.800001 ,\n", " -1.800001 , -1.800001 ],\n", " ...,\n", " [ nan, nan, nan, ..., nan,\n", " nan, nan],\n", " [ nan, nan, nan, ..., nan,\n", " nan, nan],\n", " [ nan, nan, nan, ..., nan,\n", " nan, nan]],\n", "\n", " [[-1.800001 , -1.800001 , -1.800001 , ..., -1.800001 ,\n", " -1.800001 , -1.800001 ],\n", " [-1.800001 , -1.800001 , -1.800001 , ..., -1.800001 ,\n", " -1.800001 , -1.800001 ],\n", " [-1.800001 , -1.800001 , -1.800001 , ..., -1.800001 ,\n", " -1.800001 , -1.800001 ],\n", "...\n", " [ nan, nan, nan, ..., nan,\n", " nan, nan],\n", " [ nan, nan, nan, ..., nan,\n", " nan, nan],\n", " [ nan, nan, nan, ..., nan,\n", " nan, nan]],\n", "\n", " [[-1.7995352, -1.7996216, -1.7998542, ..., -1.799805 ,\n", " -1.7996746, -1.7995371],\n", " [-1.7995974, -1.799774 , -1.800001 , ..., -1.800001 ,\n", " -1.7998339, -1.7996302],\n", " [-1.800001 , -1.800001 , -1.800001 , ..., -1.800001 ,\n", " -1.800001 , -1.800001 ],\n", " ...,\n", " [ nan, nan, nan, ..., nan,\n", " nan, nan],\n", " [ nan, nan, nan, ..., nan,\n", " nan, nan],\n", " [ nan, nan, nan, ..., nan,\n", " nan, nan]]], dtype=float32)\n", "Coordinates:\n", " * lat (lat) float32 88.0 86.0 84.0 82.0 80.0 ... -82.0 -84.0 -86.0 -88.0\n", " * lon (lon) float32 0.0 2.0 4.0 6.0 8.0 ... 350.0 352.0 354.0 356.0 358.0\n", " * month (month) int64 1 2 3 4 5 6 7 8 9 10 11 12
<xarray.Dataset>\n", "Dimensions: (lat: 89, lon: 180, time: 756)\n", "Coordinates:\n", " * lat (lat) float32 88.0 86.0 84.0 82.0 80.0 ... -82.0 -84.0 -86.0 -88.0\n", " * lon (lon) float32 0.0 2.0 4.0 6.0 8.0 ... 350.0 352.0 354.0 356.0 358.0\n", " * time (time) datetime64[ns] 1960-01-01 1960-02-01 ... 2022-12-01\n", "Data variables:\n", " sst (time, lat, lon) float32 1.073e-06 1.073e-06 1.073e-06 ... nan nan
<xarray.Dataset>\n", "Dimensions: (lat: 89, lon: 180, time: 756)\n", "Coordinates:\n", " * lat (lat) float32 88.0 86.0 84.0 82.0 80.0 ... -82.0 -84.0 -86.0 -88.0\n", " * lon (lon) float32 0.0 2.0 4.0 6.0 8.0 ... 350.0 352.0 354.0 356.0 358.0\n", " * time (time) datetime64[ns] 1960-01-01 1960-02-01 ... 2022-12-01\n", " month (time) int64 1 2 3 4 5 6 7 8 9 10 11 ... 2 3 4 5 6 7 8 9 10 11 12\n", "Data variables:\n", " sst (time, lat, lon) float32 0.0 0.0 0.0 0.0 0.0 ... nan nan nan nan
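The monthly climatology and anomaly outputs above follow the usual split-apply-combine pattern. Here is a minimal sketch of that pattern on synthetic data, assuming a hypothetical three-year monthly series in place of the ERSST dataset; the names `sst`, `gb`, `climatology`, and `anomaly` are illustrative only.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical stand-in for the SST data: 3 years of monthly values
time = pd.date_range("2000-01-01", periods=36, freq="MS")
month = time.month.to_numpy()
seasonal = 10 + 5 * np.sin(2 * np.pi * (month - 1) / 12)
offset = np.repeat([0.0, 0.5, 1.0], 12)  # a different offset for each year
sst = xr.DataArray(seasonal + offset, dims="time", coords={"time": time}, name="sst")

# Split by calendar month, then average each group over time
gb = sst.groupby("time.month")
climatology = gb.mean("time")  # one value per month

# Subtracting the climatology from the groupby object removes the seasonal cycle
# and attaches a non-index "month" coordinate along time
anomaly = gb - climatology
```

By construction, the anomalies for any given calendar month average to zero over the three years.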
<xarray.Dataset>\n", "Dimensions: (time: 14, lat: 89, lon: 180)\n", "Coordinates:\n", " * lat (lat) float32 88.0 86.0 84.0 82.0 80.0 ... -82.0 -84.0 -86.0 -88.0\n", " * lon (lon) float32 0.0 2.0 4.0 6.0 8.0 ... 350.0 352.0 354.0 356.0 358.0\n", " * time (time) datetime64[ns] 1960-12-31 1965-12-31 ... 2025-12-31\n", "Data variables:\n", " sst (time, lat, lon) float32 -0.0005343 -0.000517 ... nan nan
<xarray.Dataset>\n", "Dimensions: (lat: 89, lon: 180, time: 756)\n", "Coordinates:\n", " * lat (lat) float32 88.0 86.0 84.0 82.0 80.0 ... -82.0 -84.0 -86.0 -88.0\n", " * lon (lon) float32 0.0 2.0 4.0 6.0 8.0 ... 350.0 352.0 354.0 356.0 358.0\n", " * time (time) datetime64[ns] 1960-01-01 1960-02-01 ... 2022-12-01\n", " month (time) int64 1 2 3 4 5 6 7 8 9 10 11 ... 2 3 4 5 6 7 8 9 10 11 12\n", "Data variables:\n", " sst (time, lat, lon) float32 nan nan nan nan nan ... nan nan nan nan
<xarray.Dataset>\n", "Dimensions: (time: 63, lat: 89, lon: 180)\n", "Coordinates:\n", " * lat (lat) float32 88.0 86.0 84.0 82.0 80.0 ... -82.0 -84.0 -86.0 -88.0\n", " * lon (lon) float32 0.0 2.0 4.0 6.0 8.0 ... 350.0 352.0 354.0 356.0 358.0\n", " * time (time) datetime64[ns] 1960-06-16T08:00:00 ... 2022-06-16T12:00:00\n", "Data variables:\n", " sst (time, lat, lon) float32 -1.8 -1.8 -1.8 -1.8 ... nan nan nan nan\n", "Attributes: (12/39)\n", " climatology: Climatology is based on 1971-2000 SST, X...\n", " description: In situ data: ICOADS2.5 before 2007 and ...\n", " keywords_vocabulary: NASA Global Change Master Directory (GCM...\n", " keywords: Earth Science > Oceans > Ocean Temperatu...\n", " instrument: Conventional thermometers\n", " source_comment: SSTs were observed by conventional therm...\n", " ... ...\n", " comment: SSTs were observed by conventional therm...\n", " summary: ERSST.v5 is developed based on v4 after ...\n", " dataset_title: NOAA Extended Reconstructed SST V5\n", " _NCProperties: version=2,netcdf=4.6.3,hdf5=1.10.5\n", " data_modified: 2023-11-03\n", " DODS_EXTRA.Unlimited_Dimension: time
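The three preceding outputs differ mainly in how they treat the time axis: resampling bins by calendar period, rolling keeps the original length, and coarsening averages fixed-size blocks. The sketch below illustrates this on a hypothetical four-year monthly series. The `"YS"` (year-start) frequency alias is an assumption made for portability; the spelling of year-end aliases varies between pandas versions.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical monthly series: 4 years, values 0..47
time = pd.date_range("2000-01-01", periods=48, freq="MS")
da = xr.DataArray(np.arange(48.0), dims="time", coords={"time": time})

# Resample: one mean per calendar year, labelled by the start of the year
annual = da.resample(time="YS").mean()

# Rolling: 12-month centered moving average; length is unchanged,
# but windows that run off either end are filled with NaN
smooth = da.rolling(time=12, center=True).mean()

# Coarsen: means over non-overlapping blocks of 12 consecutive time steps
blocks = da.coarsen(time=12).mean()

print(annual.sizes["time"], smooth.sizes["time"], blocks.sizes["time"])  # 4 48 4
```

Because this series starts in January and has no gaps, `resample` and `coarsen` produce the same values here; they differ only in how the resulting time coordinate is labelled.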
<xarray.Dataset>\n", "Dimensions: (Y: 180, Z: 33, X: 360)\n", "Coordinates:\n", " * Y (Y) float32 -89.5 -88.5 -87.5 -86.5 -85.5 ... 86.5 87.5 88.5 89.5\n", " * Z (Z) float32 0.0 10.0 20.0 30.0 50.0 ... 4e+03 4.5e+03 5e+03 5.5e+03\n", " * X (X) float32 0.5 1.5 2.5 3.5 4.5 ... 355.5 356.5 357.5 358.5 359.5\n", "Data variables:\n", " basin (Z, Y, X) float32 ...\n", "Attributes:\n", " Conventions: IRIDL
<xarray.Dataset>\n", "Dimensions: (lat: 180, Z: 33, lon: 360)\n", "Coordinates:\n", " * lat (lat) float32 -89.5 -88.5 -87.5 -86.5 -85.5 ... 86.5 87.5 88.5 89.5\n", " * Z (Z) float32 0.0 10.0 20.0 30.0 50.0 ... 4e+03 4.5e+03 5e+03 5.5e+03\n", " * lon (lon) float32 0.5 1.5 2.5 3.5 4.5 ... 355.5 356.5 357.5 358.5 359.5\n", "Data variables:\n", " basin (Z, lat, lon) float32 ...\n", "Attributes:\n", " Conventions: IRIDL
<xarray.DataArray 'basin' (lat: 180, lon: 360)>\n", "[64800 values with dtype=float32]\n", "Coordinates:\n", " * lat (lat) float32 -89.5 -88.5 -87.5 -86.5 -85.5 ... 86.5 87.5 88.5 89.5\n", " Z float32 0.0\n", " * lon (lon) float32 0.5 1.5 2.5 3.5 4.5 ... 355.5 356.5 357.5 358.5 359.5\n", "Attributes:\n", " long_name: basin code\n", " units: ids\n", " scale_max: 58\n", " CLIST: Atlantic Ocean\\nPacific Ocean \\nIndian Ocean\\nMediterranean S...\n", " valid_min: 1\n", " valid_max: 58\n", " scale_min: 1
<xarray.DataArray 'sst' (time: 756, basin: 14)>\n", "array([[-1.8 , -1.8 , 23.455315 , ..., -1.8 ,\n", " 3.3971915 , 24.182198 ],\n", " [-1.8 , -1.8 , 23.722523 , ..., -1.8 ,\n", " 0.03573781, 24.59657 ],\n", " [-1.8 , -1.8 , 24.601315 , ..., -1.8 ,\n", " -0.26487017, 26.234186 ],\n", " ...,\n", " [ 0.89445347, 4.685296 , 29.049557 , ..., 8.882076 ,\n", " 16.515127 , 29.450462 ],\n", " [-0.31460398, 1.8985674 , 27.785666 , ..., 3.4794273 ,\n", " 11.925127 , 27.901617 ],\n", " [-1.8 , -0.24241269, 26.120224 , ..., 1.3552847 ,\n", " 7.9607453 , 25.901285 ]], dtype=float32)\n", "Coordinates:\n", " * time (time) datetime64[ns] 1960-01-01 1960-02-01 ... 2022-12-01\n", " * basin (basin) float32 1.0 2.0 3.0 4.0 5.0 ... 10.0 11.0 12.0 53.0 56.0\n", "Attributes:\n", " long_name: Monthly Means of Sea Surface Temperature\n", " units: degC\n", " var_desc: Sea Surface Temperature\n", " level_desc: Surface\n", " statistic: Mean\n", " dataset: NOAA Extended Reconstructed SST V5\n", " parent_stat: Individual Values\n", " actual_range: [-1.8 42.32636]\n", " valid_range: [-1.8 45. ]\n", " _ChunkSizes: [ 1 89 180]
<xarray.DataArray 'sst' (time: 756, basin: 14)>\n", "array([[18.585499 , 20.75755 , 21.572077 , ..., 6.238062 , 6.889794 ,\n", " 26.499819 ],\n", " [18.705072 , 20.816761 , 21.902283 , ..., 4.8877654, 5.44638 ,\n", " 26.57709 ],\n", " [18.845848 , 20.865032 , 22.031416 , ..., 4.686406 , 5.5322194,\n", " 27.90856 ],\n", " ...,\n", " [20.133 , 21.700815 , 20.31083 , ..., 17.463427 , 18.683998 ,\n", " 29.5153 ],\n", " [19.80138 , 21.430943 , 20.964071 , ..., 13.358289 , 14.617571 ,\n", " 28.847633 ],\n", " [19.636013 , 21.297836 , 21.741606 , ..., 10.26373 , 11.0815325,\n", " 27.899845 ]], dtype=float32)\n", "Coordinates:\n", " * time (time) datetime64[ns] 1960-01-01 1960-02-01 ... 2022-12-01\n", " Z float32 0.0\n", " * basin (basin) float32 1.0 2.0 3.0 4.0 5.0 ... 10.0 11.0 12.0 53.0 56.0\n", "Attributes:\n", " long_name: Monthly Means of Sea Surface Temperature\n", " units: degC\n", " var_desc: Sea Surface Temperature\n", " level_desc: Surface\n", " statistic: Mean\n", " dataset: NOAA Extended Reconstructed SST V5\n", " parent_stat: Individual Values\n", " actual_range: [-1.8 42.32636]\n", " valid_range: [-1.8 45. ]\n", " _ChunkSizes: [ 1 89 180]
| basin | Z | sst |
|---|---|---|
| 1.0 | 0.0 | 19.317692 |
| 2.0 | 0.0 | 21.204735 |
| 3.0 | 0.0 | 21.147755 |
| 4.0 | 0.0 | 19.902565 |
| 5.0 | 0.0 | 8.199746 |
| 6.0 | 0.0 | 15.138650 |
| 7.0 | 0.0 | 28.522148 |
| 8.0 | 0.0 | 26.654783 |
| 9.0 | 0.0 | 0.345633 |
| 10.0 | 0.0 | 1.550839 |
| 11.0 | 0.0 | -0.799598 |
| 12.0 | 0.0 | 12.162644 |
| 53.0 | 0.0 | 14.433341 |
| 56.0 | 0.0 | 28.495367 |
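The per-basin values above come from grouping a gridded field by a label array that shares its spatial dimensions. The following is a minimal sketch of that idea, assuming a hypothetical 2x2 grid and two made-up basin codes; it omits the step of bringing the basin mask onto the SST grid.

```python
import numpy as np
import xarray as xr

# Hypothetical 2x2 field and a matching array of basin codes
sst = xr.DataArray(
    np.array([[1.0, 2.0], [3.0, 4.0]]), dims=("lat", "lon"), name="sst"
)
basin = xr.DataArray(
    np.array([[1.0, 1.0], [2.0, 2.0]]), dims=("lat", "lon"), name="basin"
)

# Grouping by a multi-dimensional label array reduces over all grid points
# that share a label, leaving one value per basin code
basin_mean = sst.groupby(basin).mean()

print(basin_mean.values)
```

The result has a single `basin` dimension whose coordinate holds the distinct codes, which is the shape that converts to a table like the one above.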