Knowledge Sharing – Common Libraries for Meteorology | A Summary of Introductory xarray Usage

1. Import the xarray library under the alias xr

import xarray as xr
import numpy as np
import pandas as pd

D:\Anaconda3\lib\site-packages\xarray\core\merge.py:17: FutureWarning: The Panel class is removed from pandas. Accessing it from the top-level namespace will also be removed in the next version
  PANDAS_TYPES = (pd.Series, pd.DataFrame, pd.Panel)

I. Creating xarray Objects

2. Create a DataArray from the given Series

s = pd.Series([5,6,7,8])
s

0    5
1    6
2    7
3    8
dtype: int64

da = xr.DataArray.from_series(s)
da


<xarray.DataArray (index: 4)>
array([5, 6, 7, 8], dtype=int64)
Coordinates:
  * index    (index) int64 0 1 2 3

3. Create a DataArray from the given MultiIndex Series

lat = [5,6,7,8]
lon = [9,10,11,12]
idx = pd.MultiIndex.from_arrays(arrays=[lat,lon], names=["lat","lon"])
s = pd.Series(data=[1,2,3,4], index=idx)
s

lat  lon
5    9      1
6    10     2
7    11     3
8    12     4
dtype: int64

da = xr.DataArray.from_series(s)
da

<xarray.DataArray (lat: 4, lon: 4)>
array([[ 1., nan, nan, nan],
       [nan,  2., nan, nan],
       [nan, nan,  3., nan],
       [nan, nan, nan,  4.]])
Coordinates:
  * lat      (lat) int64 5 6 7 8
  * lon      (lon) int64 9 10 11 12
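
The NaN values above appear because from_series expands the MultiIndex into the full lat × lon grid, and only 4 of the 16 grid cells exist in the Series. The following is a minimal sketch of that behaviour and of the round trip back to a Series; the names n_valid and s_back are illustrative and not part of the original exercise.

```python
import pandas as pd
import xarray as xr

lat = [5, 6, 7, 8]
lon = [9, 10, 11, 12]
idx = pd.MultiIndex.from_arrays([lat, lon], names=["lat", "lon"])
s = pd.Series([1, 2, 3, 4], index=idx)

# from_series builds the full 4 x 4 grid; cells missing from the Series become NaN
da = xr.DataArray.from_series(s)

# Count the grid cells that actually carry data
n_valid = int(da.notnull().sum())

# Flatten the grid back to a Series and drop the filler NaNs to recover the input
s_back = da.to_series().dropna().astype("int64")
print(n_valid)
print(s_back)
```

Because NaN requires a floating-point dtype, the integer input is converted to float64 in the DataArray, which is why the sketch casts back to int64 at the end.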

4. Create a DataArray from the given DataFrame

df = pd.DataFrame([5,6,7,8])
df

   0
0  5
1  6
2  7
3  8

da = xr.DataArray(df)
da

<xarray.DataArray (dim_0: 4, dim_1: 1)>
array([[5],
       [6],
       [7],
       [8]], dtype=int64)
Coordinates:
  * dim_0    (dim_0) int64 0 1 2 3
  * dim_1    (dim_1) int64 0

5. Create a DataArray from the given DataFrame, whose index and columns are named lat and lon

lat = [5,6,7,8]
lon = [9,10,11,12]
temperature = np.ones((4,4))

df = pd.DataFrame(index=lat,columns=lon,data=temperature)
df.index.name = 'lat'
df.columns.name = 'lon'
df

lon   9    10   11   12
lat
5    1.0  1.0  1.0  1.0
6    1.0  1.0  1.0  1.0
7    1.0  1.0  1.0  1.0
8    1.0  1.0  1.0  1.0

da = xr.DataArray(df)
da

<xarray.DataArray (lat: 4, lon: 4)>
array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])
Coordinates:
  * lat      (lat) int64 5 6 7 8
  * lon      (lon) int64 9 10 11 12
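
The conversion also works in the other direction. The sketch below, which assumes the same lat/lon DataFrame as above, shows two ways back to pandas: to_pandas for the wide layout and to_dataframe for the long layout. The names df_wide and df_long and the column name 'temperature' are illustrative.

```python
import numpy as np
import pandas as pd
import xarray as xr

lat = [5, 6, 7, 8]
lon = [9, 10, 11, 12]
df = pd.DataFrame(np.ones((4, 4)), index=lat, columns=lon)
df.index.name = 'lat'
df.columns.name = 'lon'

da = xr.DataArray(df)

# Wide layout: a 2-D DataArray maps back to a DataFrame (rows = lat, columns = lon)
df_wide = da.to_pandas()

# Long layout: one row per (lat, lon) pair; an unnamed DataArray needs a column name
df_long = da.to_dataframe(name='temperature')
print(df_wide)
print(df_long.head())
```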

6. Create a one-dimensional DataArray directly from the given day and temperature

day = ['2020-01-01','2020-01-02']
temperature = [20,21]

da = xr.DataArray(data=temperature, dims=['day'], coords={'day': day})
da

D:\Anaconda3\lib\site-packages\xarray\core\dataarray.py:219: FutureWarning: The Panel class is removed from pandas. Accessing it from the top-level namespace will also be removed in the next version
  elif isinstance(data, pd.Panel):

<xarray.DataArray (day: 2)>
array([20, 21])
Coordinates:
  * day      (day) <U10 '2020-01-01' '2020-01-02'
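
The day labels in this tutorial are plain strings. As an optional variation (not part of the original exercise), the labels can be parsed with pd.to_datetime so that the coordinate becomes datetime64 and time-aware selection is available. The names da_time and jan below are illustrative.

```python
import pandas as pd
import xarray as xr

# Parse the date strings so the coordinate is datetime64 rather than text
day = pd.to_datetime(['2020-01-01', '2020-01-02'])
temperature = [20, 21]

da_time = xr.DataArray(data=temperature, dims=['day'], coords={'day': day})

# With a datetime coordinate, a partial date string selects the whole month
jan = da_time.sel(day='2020-01')

# The .dt accessor exposes calendar fields of the coordinate
day_of_month = da_time['day'].dt.day
print(jan)
print(day_of_month.values)
```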

7. Create a two-dimensional DataArray directly from the given lat, lon, and temperature

lat = [5,6,7,8]
lon = [9,10,11,12]
temperature = np.ones((4,4))

da = xr.DataArray(data=temperature, dims=['lat','lon'], coords=[lat,lon])
da

<xarray.DataArray (lat: 4, lon: 4)>
array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])
Coordinates:
  * lat      (lat) int32 5 6 7 8
  * lon      (lon) int32 9 10 11 12

8. Create a three-dimensional DataArray directly from the given day, lat, lon, and temperature

day = ['2020-01-01','2020-01-02']
lat = [5,6,7,8]
lon = [9,10,11,12]
temperature = np.ones((2,4,4))

da = xr.DataArray(data=temperature, dims=['day','lat','lon'], coords=[day,lat,lon])
da

<xarray.DataArray (day: 2, lat: 4, lon: 4)>
array([[[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]],

       [[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]]])
Coordinates:
  * day      (day) <U10 '2020-01-01' '2020-01-02'
  * lat      (lat) int32 5 6 7 8
  * lon      (lon) int32 9 10 11 12

9. Create a Dataset from the DataArray produced in the previous exercise

ds = xr.Dataset(data_vars={'temperature':da})
ds


<xarray.Dataset>
Dimensions:      (day: 2, lat: 4, lon: 4)
Coordinates:
  * day          (day) <U10 '2020-01-01' '2020-01-02'
  * lat          (lat) int32 5 6 7 8
  * lon          (lon) int32 9 10 11 12
Data variables:
    temperature  (day, lat, lon) float64 1.0 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 1.0

10. Create a Dataset with two variables from the result of exercise 8 and the given DataArray

day = ['2020-01-01','2020-01-02']
lat = [5,6,7,8]
lon = [9,10,11,12]
pressure = np.ones((2,4,4))*2

da_2 = xr.DataArray(data=pressure, dims=['day','lat','lon'], coords=[day,lat,lon])
da_2

<xarray.DataArray (day: 2, lat: 4, lon: 4)>
array([[[2., 2., 2., 2.],
        [2., 2., 2., 2.],
        [2., 2., 2., 2.],
        [2., 2., 2., 2.]],

       [[2., 2., 2., 2.],
        [2., 2., 2., 2.],
        [2., 2., 2., 2.],
        [2., 2., 2., 2.]]])
Coordinates:
  * day      (day) <U10 '2020-01-01' '2020-01-02'
  * lat      (lat) int32 5 6 7 8
  * lon      (lon) int32 9 10 11 12

ds = xr.Dataset(data_vars={'temperature':da, 
                           'pressure':da_2})
ds


<xarray.Dataset>
Dimensions:      (day: 2, lat: 4, lon: 4)
Coordinates:
  * day          (day) <U10 '2020-01-01' '2020-01-02'
  * lat          (lat) int32 5 6 7 8
  * lon          (lon) int32 9 10 11 12
Data variables:
    temperature  (day, lat, lon) float64 1.0 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 1.0
    pressure     (day, lat, lon) float64 2.0 2.0 2.0 2.0 2.0 ... 2.0 2.0 2.0 2.0

11. Create a Dataset directly from the given lat, lon, and temperature

lat = [5,6,7,8]
lon = [9,10,11,12]
temperature = np.ones((4,4))

ds = xr.Dataset(
    data_vars={'temperature': (('lat', 'lon'),temperature)},       
    coords={'lat':('lat',lat),
            'lon':('lon',lon) })
ds


<xarray.Dataset>
Dimensions:      (lat: 4, lon: 4)
Coordinates:
  * lat          (lat) int32 5 6 7 8
  * lon          (lon) int32 9 10 11 12
Data variables:
    temperature  (lat, lon) float64 1.0 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 1.0 1.0

12. Create a Dataset directly from the given day, lat, lon, and temperature

day = ['2020-01-01','2020-01-02']
lat = [5,6,7,8]
lon = [9,10,11,12]
temperature = np.ones((2,4,4))

ds = xr.Dataset(
    data_vars={'temperature': (('day','lat', 'lon',),temperature)},       
    coords={'lat':('lat',lat),
            'lon':('lon',lon),
            'day':('day',day)})
ds


<xarray.Dataset>
Dimensions:      (day: 2, lat: 4, lon: 4)
Coordinates:
  * lat          (lat) int32 5 6 7 8
  * lon          (lon) int32 9 10 11 12
  * day          (day) <U10 '2020-01-01' '2020-01-02'
Data variables:
    temperature  (day, lat, lon) float64 1.0 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 1.0

13. Create a Dataset with two variables directly from the given day, lat, lon, temperature, and pressure

day = ['2020-01-01','2020-01-02']
lat = [5,6,7,8]
lon = [9,10,11,12]
temperature = np.ones((2,4,4))
pressure = np.ones((2,4,4))*2

ds = xr.Dataset(data_vars={'temperature':(['day','lat','lon'],temperature), 
                           'pressure':(['day','lat','lon'],pressure)}, 
                coords={'lat': ('lat',lat), 
                        'lon': ('lon',lon), 
                        'day':('day',day)})

ds


<xarray.Dataset>
Dimensions:      (day: 2, lat: 4, lon: 4)
Coordinates:
  * lat          (lat) int32 5 6 7 8
  * lon          (lon) int32 9 10 11 12
  * day          (day) <U10 '2020-01-01' '2020-01-02'
Data variables:
    temperature  (day, lat, lon) float64 1.0 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 1.0
    pressure     (day, lat, lon) float64 2.0 2.0 2.0 2.0 2.0 ... 2.0 2.0 2.0 2.0

II. Data I/O and Basic Attributes

14. Save the DataArray created in exercise 8 to a .nc file

da.to_netcdf('dataarray.nc')

15. Read back the DataArray file just saved

da = xr.open_dataarray('dataarray.nc')
da

<xarray.DataArray (day: 2, lat: 4, lon: 4)>
array([[[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]],

       [[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]]])
Coordinates:
  * day      (day) object '2020-01-01' '2020-01-02'
  * lat      (lat) int32 5 6 7 8
  * lon      (lon) int32 9 10 11 12

16. Add the attribute author: Heywhale to the DataArray from the previous exercise

da.attrs['author']='Heywhale'
da

<xarray.DataArray (day: 2, lat: 4, lon: 4)>
array([[[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]],

       [[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]]])
Coordinates:
  * day      (day) object '2020-01-01' '2020-01-02'
  * lat      (lat) int32 5 6 7 8
  * lon      (lon) int32 9 10 11 12
Attributes:
    author:   Heywhale

17. Save the Dataset from exercise 13 to a .nc file

ds.to_netcdf('dataset.nc')

18. Read back the Dataset file just saved

data = xr.open_dataset('dataset.nc')
data


<xarray.Dataset>
Dimensions:      (day: 2, lat: 4, lon: 4)
Coordinates:
  * lat          (lat) int32 5 6 7 8
  * lon          (lon) int32 9 10 11 12
  * day          (day) object '2020-01-01' '2020-01-02'
Data variables:
    temperature  (day, lat, lon) float64 ...
    pressure     (day, lat, lon) float64 ...

19. Add the dataset attribute time: 2022-05-17 to the Dataset from the previous exercise

data.attrs['time']='2022-05-17'
data


<xarray.Dataset>
Dimensions:      (day: 2, lat: 4, lon: 4)
Coordinates:
  * lat          (lat) int32 5 6 7 8
  * lon          (lon) int32 9 10 11 12
  * day          (day) object '2020-01-01' '2020-01-02'
Data variables:
    temperature  (day, lat, lon) float64 ...
    pressure     (day, lat, lon) float64 ...
Attributes:
    time:     2022-05-17

20. Inspect the dimensions, coordinates, variables, and attributes of the Dataset from exercise 18

print(data.dims)
print('-----'*5)
print(data.coords)
print('-----'*5)
print(data.data_vars)
print('-----'*5)
print(data.attrs)

Frozen(SortedKeysDict({'day': 2, 'lat': 4, 'lon': 4}))
-------------------------
Coordinates:
  * lat      (lat) int32 5 6 7 8
  * lon      (lon) int32 9 10 11 12
  * day      (day) object '2020-01-01' '2020-01-02'
-------------------------
Data variables:
    temperature  (day, lat, lon) float64 ...
    pressure     (day, lat, lon) float64 ...
-------------------------
OrderedDict([('time', '2022-05-17')])
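
The printed representations above depend on the xarray version. When these properties are needed in code rather than on screen, they can be read as ordinary Python mappings. The sketch below rebuilds an equivalent Dataset in memory (so it does not depend on dataset.nc); the names sizes, coord_names, var_names, and time_attr are illustrative.

```python
import numpy as np
import xarray as xr

day = ['2020-01-01', '2020-01-02']
lat = [5, 6, 7, 8]
lon = [9, 10, 11, 12]
ds = xr.Dataset(
    data_vars={'temperature': (['day', 'lat', 'lon'], np.ones((2, 4, 4))),
               'pressure': (['day', 'lat', 'lon'], np.ones((2, 4, 4)) * 2)},
    coords={'day': day, 'lat': lat, 'lon': lon},
    attrs={'time': '2022-05-17'})

sizes = dict(ds.sizes)           # dimension name -> length
coord_names = sorted(ds.coords)  # iterating a coordinates mapping yields names
var_names = list(ds.data_vars)   # data variable names in insertion order
time_attr = ds.attrs['time']     # attrs behaves like a plain dict
print(sizes, coord_names, var_names, time_attr)
```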

21. Extract the temperature DataArray from the Dataset in exercise 18

temp = data['temperature']
temp

<xarray.DataArray 'temperature' (day: 2, lat: 4, lon: 4)>
array([[[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]],

       [[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]]])
Coordinates:
  * lat      (lat) int32 5 6 7 8
  * lon      (lon) int32 9 10 11 12
  * day      (day) object '2020-01-01' '2020-01-02'

III. Indexing and Slicing

22. Select the data in da where day is 2020-01-01, lat is 5, and lon is 9

day = ['2020-01-01','2020-01-02','2020-01-03']
lat = [5,6,7,8]
lon = [9,10,11,12]
temperature = np.array(range(3*4*4)).reshape((3,4,4))

da = xr.DataArray(data=temperature, dims=['day','lat','lon'], coords=[day,lat,lon])
da

<xarray.DataArray (day: 3, lat: 4, lon: 4)>
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11],
        [12, 13, 14, 15]],

       [[16, 17, 18, 19],
        [20, 21, 22, 23],
        [24, 25, 26, 27],
        [28, 29, 30, 31]],

       [[32, 33, 34, 35],
        [36, 37, 38, 39],
        [40, 41, 42, 43],
        [44, 45, 46, 47]]])
Coordinates:
  * day      (day) <U10 '2020-01-01' '2020-01-02' '2020-01-03'
  * lat      (lat) int32 5 6 7 8
  * lon      (lon) int32 9 10 11 12

# Method 1: positional indexing
da[0,0,0]


<xarray.DataArray ()>
array(0)
Coordinates:
    day      <U10 '2020-01-01'
    lat      int32 5
    lon      int32 9

# Method 2: label-based indexing with .loc
da.loc['2020-01-01',5,9]


<xarray.DataArray ()>
array(0)
Coordinates:
    day      <U10 '2020-01-01'
    lat      int32 5
    lon      int32 9

# Method 3: positional indexing by dimension name with isel
da.isel(day=0,lat=0,lon=0)


<xarray.DataArray ()>
array(0)
Coordinates:
    day      <U10 '2020-01-01'
    lat      int32 5
    lon      int32 9

# Method 4: label-based indexing by dimension name with sel
da.sel(day='2020-01-01',lat=5,lon=9)


<xarray.DataArray ()>
array(0)
Coordinates:
    day      <U10 '2020-01-01'
    lat      int32 5
    lon      int32 9
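
Of the four methods, sel is usually the most convenient for gridded data because it also accepts inexact labels and label ranges. The following sketch, which assumes the same da as in this exercise, shows nearest-neighbour lookup and label slicing; the target values 5.2 and 9.4 and the names nearest and box are made up for illustration.

```python
import numpy as np
import xarray as xr

day = ['2020-01-01', '2020-01-02', '2020-01-03']
lat = [5, 6, 7, 8]
lon = [9, 10, 11, 12]
temperature = np.arange(3 * 4 * 4).reshape((3, 4, 4))
da = xr.DataArray(data=temperature, dims=['day', 'lat', 'lon'], coords=[day, lat, lon])

# Nearest-neighbour lookup: 5.2 snaps to lat=5 and 9.4 snaps to lon=9
nearest = da.sel(lat=5.2, lon=9.4, method='nearest')

# Label slices include both endpoints, unlike positional slices
box = da.sel(lat=slice(6, 7), lon=slice(10, 12))
print(nearest.values)
print(box.shape)
```

Dimensions that are not mentioned (here day) are kept in full, so nearest still has one value per day.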

23. Select the data for the two days 2020-01-01 and 2020-01-02 from da in the previous exercise

# Method 1: positional slicing
da[0:2,:,:]

<xarray.DataArray (day: 2, lat: 4, lon: 4)>
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11],
        [12, 13, 14, 15]],

       [[16, 17, 18, 19],
        [20, 21, 22, 23],
        [24, 25, 26, 27],
        [28, 29, 30, 31]]])
Coordinates:
  * day      (day) <U10 '2020-01-01' '2020-01-02'
  * lat      (lat) int32 5 6 7 8
  * lon      (lon) int32 9 10 11 12

# Method 2: label-based slicing with .loc
da.loc['2020-01-01':'2020-01-02',:,:]

<xarray.DataArray (day: 2, lat: 4, lon: 4)>
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11],
        [12, 13, 14, 15]],

       [[16, 17, 18, 19],
        [20, 21, 22, 23],
        [24, 25, 26, 27],
        [28, 29, 30, 31]]])
Coordinates:
  * day      (day) <U10 '2020-01-01' '2020-01-02'
  * lat      (lat) int32 5 6 7 8
  * lon      (lon) int32 9 10 11 12

# Method 3: a list of positions with isel
da.isel(day=[0,1])

<xarray.DataArray (day: 2, lat: 4, lon: 4)>
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11],
        [12, 13, 14, 15]],

       [[16, 17, 18, 19],
        [20, 21, 22, 23],
        [24, 25, 26, 27],
        [28, 29, 30, 31]]])
Coordinates:
  * day      (day) <U10 '2020-01-01' '2020-01-02'
  * lat      (lat) int32 5 6 7 8
  * lon      (lon) int32 9 10 11 12

# Method 4: a list of labels with sel
da.sel(day=['2020-01-01','2020-01-02'])

<xarray.DataArray (day: 2, lat: 4, lon: 4)>
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11],
        [12, 13, 14, 15]],

       [[16, 17, 18, 19],
        [20, 21, 22, 23],
        [24, 25, 26, 27],
        [28, 29, 30, 31]]])
Coordinates:
  * day      (day) <U10 '2020-01-01' '2020-01-02'
  * lat      (lat) int32 5 6 7 8
  * lon      (lon) int32 9 10 11 12

24. Select the data in ds where day is 2020-01-01

day = ['2020-01-01','2020-01-02','2020-01-03']
lat = [5,6,7,8]
lon = [9,10,11,12]
temperature = np.array(range(3*4*4)).reshape((3,4,4))

da = xr.DataArray(data=temperature, dims=['day','lat','lon'], coords=[day,lat,lon])
ds = da.to_dataset(name = 'temperature')
ds


<xarray.Dataset>
Dimensions:      (day: 3, lat: 4, lon: 4)
Coordinates:
  * day          (day) <U10 '2020-01-01' '2020-01-02' '2020-01-03'
  * lat          (lat) int32 5 6 7 8
  * lon          (lon) int32 9 10 11 12
Data variables:
    temperature  (day, lat, lon) int32 0 1 2 3 4 5 6 7 ... 41 42 43 44 45 46 47

# Method 1: by position with isel
ds.isel(day=0)


<xarray.Dataset>
Dimensions:      (lat: 4, lon: 4)
Coordinates:
    day          <U10 '2020-01-01'
  * lat          (lat) int32 5 6 7 8
  * lon          (lon) int32 9 10 11 12
Data variables:
    temperature  (lat, lon) int32 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

# Method 2: by label with sel
ds.sel(day='2020-01-01')


<xarray.Dataset>
Dimensions:      (lat: 4, lon: 4)
Coordinates:
    day          <U10 '2020-01-01'
  * lat          (lat) int32 5 6 7 8
  * lon          (lon) int32 9 10 11 12
Data variables:
    temperature  (lat, lon) int32 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

25. Select the data in ds from the previous exercise where day is 2020-01-01 or 2020-01-03

# Method 1: by position with isel
ds.isel(day=[0,2])


<xarray.Dataset>
Dimensions:      (day: 2, lat: 4, lon: 4)
Coordinates:
  * day          (day) <U10 '2020-01-01' '2020-01-03'
  * lat          (lat) int32 5 6 7 8
  * lon          (lon) int32 9 10 11 12
Data variables:
    temperature  (day, lat, lon) int32 0 1 2 3 4 5 6 7 ... 41 42 43 44 45 46 47

# Method 2: by label with sel
ds.sel(day=['2020-01-01','2020-01-03'])


<xarray.Dataset>
Dimensions:      (day: 2, lat: 4, lon: 4)
Coordinates:
  * day          (day) <U10 '2020-01-01' '2020-01-03'
  * lat          (lat) int32 5 6 7 8
  * lon          (lon) int32 9 10 11 12
Data variables:
    temperature  (day, lat, lon) int32 0 1 2 3 4 5 6 7 ... 41 42 43 44 45 46 47

26. Select the first day of the temperature variable in ds from the previous exercise

ds['temperature'][0]

<xarray.DataArray 'temperature' (lat: 4, lon: 4)>
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])
Coordinates:
    day      <U10 '2020-01-01'
  * lat      (lat) int32 5 6 7 8
  * lon      (lon) int32 9 10 11 12

27. Select the second and third days of the temperature variable in ds from the previous exercise

ds['temperature'][1:3]

<xarray.DataArray 'temperature' (day: 2, lat: 4, lon: 4)>
array([[[16, 17, 18, 19],
        [20, 21, 22, 23],
        [24, 25, 26, 27],
        [28, 29, 30, 31]],

       [[32, 33, 34, 35],
        [36, 37, 38, 39],
        [40, 41, 42, 43],
        [44, 45, 46, 47]]])
Coordinates:
  * day      (day) <U10 '2020-01-02' '2020-01-03'
  * lat      (lat) int32 5 6 7 8
  * lon      (lon) int32 9 10 11 12

IV. Computation and groupby

28. Set the value in the third row, fourth column of da to 5

lat = [5,6,7,8]
lon = [9,10,11,12]
temperature = np.ones((4,4))
da = xr.DataArray(data=temperature, dims=['lat','lon'], coords=[lat,lon])
da

<xarray.DataArray (lat: 4, lon: 4)>
array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])
Coordinates:
  * lat      (lat) int32 5 6 7 8
  * lon      (lon) int32 9 10 11 12

da[2,3]=5
da

<xarray.DataArray (lat: 4, lon: 4)>
array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 5.],
       [1., 1., 1., 1.]])
Coordinates:
  * lat      (lat) int32 5 6 7 8
  * lon      (lon) int32 9 10 11 12
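
The same cell can also be addressed by its coordinate labels rather than by row and column number, which avoids counting positions. This is a minimal sketch assuming the same da as above: the third row is lat=7 and the fourth column is lon=12.

```python
import numpy as np
import xarray as xr

lat = [5, 6, 7, 8]
lon = [9, 10, 11, 12]
temperature = np.ones((4, 4))
da = xr.DataArray(data=temperature, dims=['lat', 'lon'], coords=[lat, lon])

# Label-based assignment: .loc with a dict names each dimension explicitly
da.loc[dict(lat=7, lon=12)] = 5
print(da)
```

Note that assignment needs .loc (or plain positional indexing); da.sel(...) returns a new selection and cannot be used on the left-hand side of an assignment.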

29. Subtract 4 from every element of da from the previous exercise

da-4

<xarray.DataArray (lat: 4, lon: 4)>
array([[-3., -3., -3., -3.],
       [-3., -3., -3., -3.],
       [-3., -3., -3.,  1.],
       [-3., -3., -3., -3.]])
Coordinates:
  * lat      (lat) int32 5 6 7 8
  * lon      (lon) int32 9 10 11 12

30. Multiply every element of da from exercise 28 by 3

da*3

<xarray.DataArray (lat: 4, lon: 4)>
array([[ 3.,  3.,  3.,  3.],
       [ 3.,  3.,  3.,  3.],
       [ 3.,  3.,  3., 15.],
       [ 3.,  3.,  3.,  3.]])
Coordinates:
  * lat      (lat) int32 5 6 7 8
  * lon      (lon) int32 9 10 11 12

31. Take the sine of every element of da from exercise 28

np.sin(da)

<xarray.DataArray (lat: 4, lon: 4)>
array([[ 0.841471,  0.841471,  0.841471,  0.841471],
       [ 0.841471,  0.841471,  0.841471,  0.841471],
       [ 0.841471,  0.841471,  0.841471, -0.958924],
       [ 0.841471,  0.841471,  0.841471,  0.841471]])
Coordinates:
  * lat      (lat) int32 5 6 7 8
  * lon      (lon) int32 9 10 11 12

32. Transpose da from exercise 28

da.T

<xarray.DataArray (lon: 4, lat: 4)>
array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 5., 1.]])
Coordinates:
  * lat      (lat) int32 5 6 7 8
  * lon      (lon) int32 9 10 11 12

33. Divide the elements of the transposed array from the previous exercise by 2

da.T/2

<xarray.DataArray (lon: 4, lat: 4)>
array([[0.5, 0.5, 0.5, 0.5],
       [0.5, 0.5, 0.5, 0.5],
       [0.5, 0.5, 0.5, 0.5],
       [0.5, 0.5, 2.5, 0.5]])
Coordinates:
  * lat      (lat) int32 5 6 7 8
  * lon      (lon) int32 9 10 11 12

34. Sum all elements of da from exercise 28

da.sum()


<xarray.DataArray ()>
array(20.)

35. Average da from exercise 28 along the lat dimension

da.mean(dim='lat')


<xarray.DataArray (lon: 4)>
array([1., 1., 1., 2.])
Coordinates:
  * lon      (lon) int32 9 10 11 12
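
The dim argument names the dimension that is collapsed, so the result keeps the other dimension. As a small illustration on the same da (with the value 5 at the third row, fourth column), the sketch below reduces along lon, along both dimensions, and with a different reduction; the result names are illustrative.

```python
import numpy as np
import xarray as xr

lat = [5, 6, 7, 8]
lon = [9, 10, 11, 12]
da = xr.DataArray(data=np.ones((4, 4)), dims=['lat', 'lon'], coords=[lat, lon])
da[2, 3] = 5

# Collapse lon: one mean per lat
mean_over_lon = da.mean(dim='lon')

# Collapse both dimensions: a single scalar
mean_overall = da.mean(dim=['lat', 'lon'])

# Other reductions take dim in the same way
max_over_lat = da.max(dim='lat')
print(mean_over_lon.values, float(mean_overall), max_over_lat.values)
```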

36. Take the absolute value of the temperature variable in ds

lat = [5,6,7,8]
lon = [9,10,11,12]
temperature = np.ones((4,4))*(-1)
ds = xr.Dataset(
    data_vars={'temperature': (('lat', 'lon'),temperature)},       
    coords={'lat':('lat',lat),
            'lon':('lon',lon) })
ds


<xarray.Dataset>
Dimensions:      (lat: 4, lon: 4)
Coordinates:
  * lat          (lat) int32 5 6 7 8
  * lon          (lon) int32 9 10 11 12
Data variables:
    temperature  (lat, lon) float64 -1.0 -1.0 -1.0 -1.0 ... -1.0 -1.0 -1.0 -1.0

abs(ds['temperature'])

<xarray.DataArray 'temperature' (lat: 4, lon: 4)>
array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])
Coordinates:
  * lat      (lat) int32 5 6 7 8
  * lon      (lon) int32 9 10 11 12

37. Group da by the lat dimension

lat = [5,6,5,6]
lon = [9,10,11,12]
temperature = np.arange(1,17).reshape(4,4)
da = xr.DataArray(data=temperature, dims=['lat','lon'], coords=[lat,lon])
da

<xarray.DataArray (lat: 4, lon: 4)>
array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12],
       [13, 14, 15, 16]])
Coordinates:
  * lat      (lat) int32 5 6 5 6
  * lon      (lon) int32 9 10 11 12

da.groupby('lat')

<xarray.core.groupby.DataArrayGroupBy at 0x1fb6d5ae400>

38. Inspect the groups from the previous exercise

da.groupby('lat').groups

{5: [0, 2], 6: [1, 3]}

list(da.groupby('lat'))

[(5,
  <xarray.DataArray (lat: 2, lon: 4)>
  array([[ 1,  2,  3,  4],
         [ 9, 10, 11, 12]])
  Coordinates:
    * lat      (lat) int32 5 5
    * lon      (lon) int32 9 10 11 12),
 (6,
  <xarray.DataArray (lat: 2, lon: 4)>
  array([[ 5,  6,  7,  8],
         [13, 14, 15, 16]])
  Coordinates:
    * lat      (lat) int32 6 6
    * lon      (lon) int32 9 10 11 12)]

39. Average over the groups from the previous exercise

da.groupby('lat').mean()

D:\Anaconda3\lib\site-packages\xarray\core\groupby.py:639: FutureWarning: Default reduction dimension will be changed to the grouped dimension in a future version of xarray. To silence this warning, pass dim=xarray.ALL_DIMS explicitly.
  skipna=skipna, allow_lazy=True, **kwargs)






<xarray.DataArray (lat: 2)>
array([ 6.5, 10.5])
Coordinates:
  * lat      (lat) int64 5 6
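
As the FutureWarning above announces, the default reduction dimension of a grouped mean changed in later xarray releases: recent versions reduce only over the grouped dimension, so the same call returns one mean per (lat, lon) pair instead of one per lat. The sketch below assumes a recent xarray release and names the reduction dimensions explicitly so that the result does not depend on the default; per_group_and_lon and per_group are illustrative names.

```python
import numpy as np
import xarray as xr

lat = [5, 6, 5, 6]
lon = [9, 10, 11, 12]
temperature = np.arange(1, 17).reshape(4, 4)
da = xr.DataArray(data=temperature, dims=['lat', 'lon'], coords=[lat, lon])

# Reduce only over the grouped dimension: one mean per (lat group, lon)
per_group_and_lon = da.groupby('lat').mean(dim='lat')

# Reduce over all dimensions within each group (Ellipsis means "all"):
# this reproduces the one-value-per-lat result shown above
per_group = da.groupby('lat').mean(...)
print(per_group_and_lon.values)
print(per_group.values)
```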

40. Group ds by the lon dimension

lat = [5,6,7,8]
lon = [9,20,20,20]
temperature = np.ones((4,4))*(-1)
ds = xr.Dataset(
    data_vars={'temperature': (('lat', 'lon'),temperature)},       
    coords={'lat':('lat',lat),
            'lon':('lon',lon) })
ds


<xarray.Dataset>
Dimensions:      (lat: 4, lon: 4)
Coordinates:
  * lat          (lat) int32 5 6 7 8
  * lon          (lon) int32 9 20 20 20
Data variables:
    temperature  (lat, lon) float64 -1.0 -1.0 -1.0 -1.0 ... -1.0 -1.0 -1.0 -1.0

ds.groupby('lon')

<xarray.core.groupby.DatasetGroupBy at 0x1fb6d59af98>

list(ds.groupby('lon'))

[(9,
  <xarray.Dataset>
  Dimensions:      (lat: 4, lon: 1)
  Coordinates:
    * lat          (lat) int32 5 6 7 8
    * lon          (lon) int32 9
  Data variables:
      temperature  (lat, lon) float64 -1.0 -1.0 -1.0 -1.0),
 (20,
  <xarray.Dataset>
  Dimensions:      (lat: 4, lon: 3)
  Coordinates:
    * lat          (lat) int32 5 6 7 8
    * lon          (lon) int32 20 20 20
  Data variables:
      temperature  (lat, lon) float64 -1.0 -1.0 -1.0 -1.0 ... -1.0 -1.0 -1.0 -1.0)]

V. Handling Missing Values and Interpolation

41. Check which values in da are missing

da = xr.DataArray([3,4,np.nan,6,np.nan,8],dims=['lat'])
da


<xarray.DataArray (lat: 6)>
array([ 3.,  4., nan,  6., nan,  8.])
Dimensions without coordinates: lat

da.isnull()


<xarray.DataArray (lat: 6)>
array([False, False,  True, False,  True, False])
Dimensions without coordinates: lat

42. Check which values in da from the previous exercise are not missing

da.notnull()


<xarray.DataArray (lat: 6)>
array([ True,  True, False,  True, False,  True])
Dimensions without coordinates: lat

43. Count the non-missing elements of da from exercise 41

da.count()


<xarray.DataArray ()>
array(4)

44. Drop the missing values along the lat dimension of da from exercise 41

da.dropna(dim='lat')


<xarray.DataArray (lat: 4)>
array([3., 4., 6., 8.])
Dimensions without coordinates: lat

45. Fill the missing values in da from exercise 41

da.fillna(999)


<xarray.DataArray (lat: 6)>
array([  3.,   4., 999.,   6., 999.,   8.])
Dimensions without coordinates: lat

46. Interpolate the missing values in da from exercise 41 using the linear method

da.interpolate_na(dim='lat',method='linear')


<xarray.DataArray (lat: 6)>
array([3., 4., 5., 6., 7., 8.])
Dimensions without coordinates: lat
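
Two related operations are often useful alongside the ones above. fillna also accepts a computed value such as the mean of the valid data, and where works in the opposite direction by turning values that fail a condition into NaN. This is a minimal sketch on the same da; the threshold 3 and the names filled_mean and masked are made up for illustration.

```python
import numpy as np
import xarray as xr

da = xr.DataArray([3, 4, np.nan, 6, np.nan, 8], dims=['lat'])

# Fill gaps with the mean of the valid values (mean() skips NaN by default)
filled_mean = da.fillna(da.mean())

# where() keeps values that satisfy the condition and masks the rest as NaN
masked = da.where(da > 3)
print(filled_mean.values)
print(masked.values)
```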

VI. Concatenation

47. Concatenate da_1 and da_2 along the day dimension

day = ['day1','day2','day3','day4']
da_1 = xr.DataArray([1,2,3,4],dims=['day'],coords=[day])
day = ['day5','day6','day7','day8']
da_2 = xr.DataArray([5,6,7,8],dims=['day'],coords=[day])


print(da_1)
print('-----'*5)
print(da_2)


<xarray.DataArray (day: 4)>
array([1, 2, 3, 4])
Coordinates:
  * day      (day) <U4 'day1' 'day2' 'day3' 'day4'
-------------------------
<xarray.DataArray (day: 4)>
array([5, 6, 7, 8])
Coordinates:
  * day      (day) <U4 'day5' 'day6' 'day7' 'day8'

xr.concat([da_1,da_2],dim='day')


<xarray.DataArray (day: 8)>
array([1, 2, 3, 4, 5, 6, 7, 8])
Coordinates:
  * day      (day) object 'day1' 'day2' 'day3' 'day4' ... 'day6' 'day7' 'day8'

48. Concatenate the temperature1 variable of ds_1 and the temperature2 variable of ds_2 along the day dimension

day = ['day1','day2','day3','day4']
da_1 = xr.DataArray([1,2,3,4],dims=['day'],coords=[day])
ds_1 = da_1.to_dataset(name = 'temperature1')
day = ['day5','day6','day7','day8']
da_2 = xr.DataArray([5,6,7,8],dims=['day'],coords=[day])
ds_2 = da_2.to_dataset(name = 'temperature2')
print(ds_1)
print('-----'*5)
print(ds_2)


<xarray.Dataset>
Dimensions:       (day: 4)
Coordinates:
  * day           (day) <U4 'day1' 'day2' 'day3' 'day4'
Data variables:
    temperature1  (day) int32 1 2 3 4
-------------------------
<xarray.Dataset>
Dimensions:       (day: 4)
Coordinates:
  * day           (day) <U4 'day5' 'day6' 'day7' 'day8'
Data variables:
    temperature2  (day) int32 5 6 7 8

xr.concat([ds_1['temperature1'],ds_2['temperature2']],dim='day')


array([1, 2, 3, 4, 5, 6, 7, 8])
Coordinates:
  * day      (day) object 'day1' 'day2' 'day3' 'day4' ... 'day6' 'day7' 'day8'

49. Concatenate ds_1 and ds_2 along the day dimension (the two Datasets have the same variable)

day = ['day1','day2','day3','day4']
da_1 = xr.DataArray([1,2,3,4],dims=['day'],coords=[day])
ds_1 = da_1.to_dataset(name = 'temperature')
day = ['day5','day6','day7','day8']
da_2 = xr.DataArray([5,6,7,8],dims=['day'],coords=[day])
ds_2 = da_2.to_dataset(name = 'temperature')
print(ds_1)
print('-----'*5)
print(ds_2)


<xarray.Dataset>
Dimensions:      (day: 4)
Coordinates:
  * day          (day) <U4 'day1' 'day2' 'day3' 'day4'
Data variables:
    temperature  (day) int32 1 2 3 4
-------------------------
<xarray.Dataset>
Dimensions:      (day: 4)
Coordinates:
  * day          (day) <U4 'day5' 'day6' 'day7' 'day8'
Data variables:
    temperature  (day) int32 5 6 7 8

# Method 1: concat
xr.concat([ds_1,ds_2],dim='day')


<xarray.Dataset>
Dimensions:      (day: 8)
Coordinates:
  * day          (day) object 'day1' 'day2' 'day3' ... 'day6' 'day7' 'day8'
Data variables:
    temperature  (day) int32 1 2 3 4 5 6 7 8

# Method 2: merge
xr.merge([ds_1,ds_2])


<xarray.Dataset>
Dimensions:      (day: 8)
Coordinates:
  * day          (day) object 'day1' 'day2' 'day3' ... 'day6' 'day7' 'day8'
Data variables:
    temperature  (day) float64 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0

50. Concatenate ds_1 and ds_2 along the day dimension (the two Datasets have different variables)

day = ['day1','day2','day3','day4']
da_1 = xr.DataArray([1,2,3,4],dims=['day'],coords=[day])
ds_1 = da_1.to_dataset(name = 'temperature1')
day = ['day5','day6','day7','day8']
da_2 = xr.DataArray([5,6,7,8],dims=['day'],coords=[day])
ds_2 = da_2.to_dataset(name = 'temperature2')
print(ds_1)
print('-----'*5)
print(ds_2)


<xarray.Dataset>
Dimensions:       (day: 4)
Coordinates:
  * day           (day) <U4 'day1' 'day2' 'day3' 'day4'
Data variables:
    temperature1  (day) int32 1 2 3 4
-------------------------
<xarray.Dataset>
Dimensions:       (day: 4)
Coordinates:
  * day           (day) <U4 'day5' 'day6' 'day7' 'day8'
Data variables:
    temperature2  (day) int32 5 6 7 8

xr.merge([ds_1,ds_2])


<xarray.Dataset>
Dimensions:       (day: 8)
Coordinates:
  * day           (day) object 'day1' 'day2' 'day3' ... 'day6' 'day7' 'day8'
Data variables:
    temperature1  (day) float64 1.0 2.0 3.0 4.0 nan nan nan nan
    temperature2  (day) float64 nan nan nan nan 5.0 6.0 7.0 8.0
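
The NaN values and the switch to float64 in the merged result come from the alignment step: each variable is reindexed onto the union of the day labels, and days missing from a variable are filled with NaN. The join argument of xr.merge controls this. The sketch below uses a hypothetical variation of the exercise in which the two Datasets overlap on day3 and day4, so that the difference between an outer and an inner join is visible.

```python
import xarray as xr

# Hypothetical variation: the two Datasets share day3 and day4
da_1 = xr.DataArray([1, 2, 3, 4], dims=['day'],
                    coords=[['day1', 'day2', 'day3', 'day4']])
ds_1 = da_1.to_dataset(name='temperature1')
da_2 = xr.DataArray([5, 6, 7, 8], dims=['day'],
                    coords=[['day3', 'day4', 'day5', 'day6']])
ds_2 = da_2.to_dataset(name='temperature2')

# Outer join: union of the day labels, gaps filled with NaN
outer = xr.merge([ds_1, ds_2], join='outer')

# Inner join: only the day labels present in both Datasets, so no NaN is needed
inner = xr.merge([ds_1, ds_2], join='inner')
print(outer)
print(inner)
```

Passing join explicitly also makes the intent clear to readers and keeps the result independent of the library default.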
