1. Import the xarray library under the alias xr
import xarray as xr
import numpy as np
import pandas as pd
D:\Anaconda3\lib\site-packages\xarray\core\merge.py:17: FutureWarning: The Panel class is removed from pandas. Accessing it from the top-level namespace will also be removed in the next version
PANDAS_TYPES = (pd.Series, pd.DataFrame, pd.Panel)
I. Creating xarray objects
2. Create a DataArray from the given Series
s = pd.Series([5,6,7,8])
s
0 5
1 6
2 7
3 8
dtype: int64
da = xr.DataArray.from_series(s)
da
array([5, 6, 7, 8], dtype=int64)
Coordinates:
* index (index) int64 0 1 2 3
3. Create a DataArray from the given Series with a MultiIndex
lat = [5,6,7,8]
lon = [9,10,11,12]
idx = pd.MultiIndex.from_arrays(arrays=[lat,lon], names=["lat","lon"])
s = pd.Series(data=[1,2,3,4], index=idx)
s
lat lon
5 9 1
6 10 2
7 11 3
8 12 4
dtype: int64
da = xr.DataArray.from_series(s)
da
<xarray.DataArray (lat: 4, lon: 4)>
array([[ 1., nan, nan, nan],
[nan, 2., nan, nan],
[nan, nan, 3., nan],
[nan, nan, nan, 4.]])
Coordinates:
* lat (lat) int64 5 6 7 8
* lon (lon) int64 9 10 11 12
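The NaN entries above follow from how `from_series` treats a MultiIndex: each index level becomes a dimension, and every (lat, lon) combination that is absent from the Series is filled with NaN, which also forces a float dtype. A minimal sketch of this behavior, including a round trip back to pandas (the names `n_missing` and `restored` are illustrative, not part of the exercise):

```python
import pandas as pd
import xarray as xr

# Same MultiIndex Series as in the exercise: only 4 of the 16 (lat, lon) pairs exist.
idx = pd.MultiIndex.from_arrays([[5, 6, 7, 8], [9, 10, 11, 12]], names=["lat", "lon"])
s = pd.Series([1, 2, 3, 4], index=idx)

da = xr.DataArray.from_series(s)

# Count the grid cells that had no entry in the Series.
n_missing = int(da.isnull().sum())

# Converting back to pandas and dropping NaN recovers the original four values.
restored = da.to_series().dropna()
```

If the data are genuinely sparse like this, keeping them as a Series may be more economical than the dense 4 x 4 grid.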
4. Create a DataArray from the given DataFrame
df = pd.DataFrame([5,6,7,8])
df
|   | 0 |
|---|---|
| 0 | 5 |
| 1 | 6 |
| 2 | 7 |
| 3 | 8 |
da = xr.DataArray(df)
da
<xarray.DataArray (dim_0: 4, dim_1: 1)>
array([[5],
[6],
[7],
[8]], dtype=int64)
Coordinates:
* dim_0 (dim_0) int64 0 1 2 3
* dim_1 (dim_1) int64 0
5. Create a DataArray from the given DataFrame whose index and columns are named lat and lon
lat = [5,6,7,8]
lon = [9,10,11,12]
temperature = np.ones((4,4))
df = pd.DataFrame(index=lat,columns=lon,data=temperature)
df.index.name = 'lat'
df.columns.name = 'lon'
df
| lon | 9 | 10 | 11 | 12 |
|---|---|---|---|---|
| lat |  |  |  |  |
| 5 | 1.0 | 1.0 | 1.0 | 1.0 |
| 6 | 1.0 | 1.0 | 1.0 | 1.0 |
| 7 | 1.0 | 1.0 | 1.0 | 1.0 |
| 8 | 1.0 | 1.0 | 1.0 | 1.0 |
da = xr.DataArray(df)
da
<xarray.DataArray (lat: 4, lon: 4)>
array([[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.]])
Coordinates:
* lat (lat) int64 5 6 7 8
* lon (lon) int64 9 10 11 12
6. Create a one-dimensional DataArray directly from the given day and temperature
day = ['2020-01-01','2020-01-02']
temperature = [20,21]
da = xr.DataArray(data=temperature, dims=['day'], coords=[day])
da
D:\Anaconda3\lib\site-packages\xarray\core\dataarray.py:219: FutureWarning: The Panel class is removed from pandas. Accessing it from the top-level namespace will also be removed in the next version
elif isinstance(data, pd.Panel):
array([20, 21])
Coordinates:
* day (day) <U10 '2020-01-01' '2020-01-02'
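Dimension names and coordinate labels are separate things: `dims` only names the axes, while `coords` attaches labels to them. A DataArray built with `dims` alone reports "Dimensions without coordinates" and cannot be selected by label. A small sketch of the difference (the variable names `no_coords`, `with_coords`, and `second_day` are illustrative):

```python
import xarray as xr

day = ['2020-01-01', '2020-01-02']
temperature = [20, 21]

# dims only names the axis; no labels are attached.
no_coords = xr.DataArray(temperature, dims=['day'])

# coords attaches the day labels, so label-based selection works.
with_coords = xr.DataArray(temperature, dims=['day'], coords={'day': day})
second_day = with_coords.sel(day='2020-01-02').item()
```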
7. Create a two-dimensional DataArray directly from the given lat, lon, and temperature
lat = [5,6,7,8]
lon = [9,10,11,12]
temperature = np.ones((4,4))
da = xr.DataArray(data=temperature, dims=['lat','lon'], coords=[lat,lon])
da
<xarray.DataArray (lat: 4, lon: 4)>
array([[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.]])
Coordinates:
* lat (lat) int32 5 6 7 8
* lon (lon) int32 9 10 11 12
8. Create a three-dimensional DataArray directly from the given day, lat, lon, and temperature
day = ['2020-01-01','2020-01-02']
lat = [5,6,7,8]
lon = [9,10,11,12]
temperature = np.ones((2,4,4))
da = xr.DataArray(data=temperature, dims=['day','lat','lon'], coords=[day,lat,lon])
da
<xarray.DataArray (day: 2, lat: 4, lon: 4)>
array([[[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.]],
[[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.]]])
Coordinates:
* day (day) <U10 '2020-01-01' '2020-01-02'
* lat (lat) int32 5 6 7 8
* lon (lon) int32 9 10 11 12
9. Create a Dataset from the DataArray produced in the previous question
ds = xr.Dataset(data_vars={'temperature':da})
ds
Dimensions: (day: 2, lat: 4, lon: 4)
Coordinates:
* day (day) <U10 '2020-01-01' '2020-01-02'
* lat (lat) int32 5 6 7 8
* lon (lon) int32 9 10 11 12
Data variables:
temperature (day, lat, lon) float64 1.0 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 1.0
10. Create a Dataset with two variables from the result of question 8 and the given DataArray
day = ['2020-01-01','2020-01-02']
lat = [5,6,7,8]
lon = [9,10,11,12]
pressure = np.ones((2,4,4))*2
da_2 = xr.DataArray(data=pressure, dims=['day','lat','lon'], coords=[day,lat,lon])
da_2
<xarray.DataArray (day: 2, lat: 4, lon: 4)>
array([[[2., 2., 2., 2.],
[2., 2., 2., 2.],
[2., 2., 2., 2.],
[2., 2., 2., 2.]],
[[2., 2., 2., 2.],
[2., 2., 2., 2.],
[2., 2., 2., 2.],
[2., 2., 2., 2.]]])
Coordinates:
* day (day) <U10 '2020-01-01' '2020-01-02'
* lat (lat) int32 5 6 7 8
* lon (lon) int32 9 10 11 12
ds = xr.Dataset(data_vars={'temperature':da,
'pressure':da_2})
ds
Dimensions: (day: 2, lat: 4, lon: 4)
Coordinates:
* day (day) <U10 '2020-01-01' '2020-01-02'
* lat (lat) int32 5 6 7 8
* lon (lon) int32 9 10 11 12
Data variables:
temperature (day, lat, lon) float64 1.0 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 1.0
pressure (day, lat, lon) float64 2.0 2.0 2.0 2.0 2.0 ... 2.0 2.0 2.0 2.0
11. Create a Dataset directly from the given lat, lon, and temperature
lat = [5,6,7,8]
lon = [9,10,11,12]
temperature = np.ones((4,4))
ds = xr.Dataset(
data_vars={'temperature': (('lat', 'lon'),temperature)},
coords={'lat':('lat',lat),
'lon':('lon',lon) })
ds
Dimensions: (lat: 4, lon: 4)
Coordinates:
* lat (lat) int32 5 6 7 8
* lon (lon) int32 9 10 11 12
Data variables:
temperature (lat, lon) float64 1.0 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 1.0 1.0
12. Create a Dataset directly from the given day, lat, lon, and temperature
day = ['2020-01-01','2020-01-02']
lat = [5,6,7,8]
lon = [9,10,11,12]
temperature = np.ones((2,4,4))
ds = xr.Dataset(
data_vars={'temperature': (('day','lat', 'lon',),temperature)},
coords={'lat':('lat',lat),
'lon':('lon',lon),
'day':('day',day)})
ds
Dimensions: (day: 2, lat: 4, lon: 4)
Coordinates:
* lat (lat) int32 5 6 7 8
* lon (lon) int32 9 10 11 12
* day (day) <U10 '2020-01-01' '2020-01-02'
Data variables:
temperature (day, lat, lon) float64 1.0 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 1.0
13. Create a Dataset with two variables directly from the given day, lat, lon, temperature, and pressure
day = ['2020-01-01','2020-01-02']
lat = [5,6,7,8]
lon = [9,10,11,12]
temperature = np.ones((2,4,4))
pressure = np.ones((2,4,4))*2
ds = xr.Dataset(data_vars={'temperature':(['day','lat','lon'],temperature),
'pressure':(['day','lat','lon'],pressure)},
coords={'lat': ('lat',lat),
'lon': ('lon',lon),
'day':('day',day)})
ds
Dimensions: (day: 2, lat: 4, lon: 4)
Coordinates:
* lat (lat) int32 5 6 7 8
* lon (lon) int32 9 10 11 12
* day (day) <U10 '2020-01-01' '2020-01-02'
Data variables:
temperature (day, lat, lon) float64 1.0 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 1.0
pressure (day, lat, lon) float64 2.0 2.0 2.0 2.0 2.0 ... 2.0 2.0 2.0 2.0
II. Data I/O and basic attributes
14. Save the DataArray created in question 8 as a .nc file
da.to_netcdf('dataarray.nc')
15. Read back the DataArray file that was just saved
da = xr.open_dataarray('dataarray.nc')
da
<xarray.DataArray (day: 2, lat: 4, lon: 4)>
array([[[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.]],
[[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.]]])
Coordinates:
* day (day) object '2020-01-01' '2020-01-02'
* lat (lat) int32 5 6 7 8
* lon (lon) int32 9 10 11 12
16. Add the attribute author: Heywhale to the DataArray from the previous question
da.attrs['author']='Heywhale'
da
<xarray.DataArray (day: 2, lat: 4, lon: 4)>
array([[[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.]],
[[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.]]])
Coordinates:
* day (day) object '2020-01-01' '2020-01-02'
* lat (lat) int32 5 6 7 8
* lon (lon) int32 9 10 11 12
Attributes:
author: Heywhale
17. Save the Dataset from question 13 as a .nc file
ds.to_netcdf('dataset.nc')
18. Read back the Dataset file that was just saved
data = xr.open_dataset('dataset.nc')
data
Dimensions: (day: 2, lat: 4, lon: 4)
Coordinates:
* lat (lat) int32 5 6 7 8
* lon (lon) int32 9 10 11 12
* day (day) object '2020-01-01' '2020-01-02'
Data variables:
temperature (day, lat, lon) float64 ...
pressure (day, lat, lon) float64 ...
19. Add the dataset attribute time: 2022-05-17 to the Dataset from the previous question
data.attrs['time']='2022-05-17'
data
Dimensions: (day: 2, lat: 4, lon: 4)
Coordinates:
* lat (lat) int32 5 6 7 8
* lon (lon) int32 9 10 11 12
* day (day) object '2020-01-01' '2020-01-02'
Data variables:
temperature (day, lat, lon) float64 ...
pressure (day, lat, lon) float64 ...
Attributes:
time: 2022-05-17
20. Inspect the dimensions, coordinates, variables, and attributes of the Dataset from question 19
print(data.dims)
print('-----'*5)
print(data.coords)
print('-----'*5)
print(data.data_vars)
print('-----'*5)
print(data.attrs)
Frozen(SortedKeysDict({'day': 2, 'lat': 4, 'lon': 4}))
-------------------------
Coordinates:
* lat (lat) int32 5 6 7 8
* lon (lon) int32 9 10 11 12
* day (day) object '2020-01-01' '2020-01-02'
-------------------------
Data variables:
temperature (day, lat, lon) float64 ...
pressure (day, lat, lon) float64 ...
-------------------------
OrderedDict([('time', '2022-05-17')])
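The printed form of `dims` and `attrs` (for example `Frozen(SortedKeysDict(...))` and `OrderedDict`) depends on the installed xarray version. When the values are needed in code rather than on screen, converting them to plain Python containers is more robust. A sketch under the assumption that `Dataset.sizes` is available, as it is in reasonably recent releases; the Dataset here is rebuilt in memory rather than read from `dataset.nc`:

```python
import numpy as np
import xarray as xr

ds = xr.Dataset(
    data_vars={'temperature': (('day', 'lat', 'lon'), np.ones((2, 4, 4)))},
    coords={'day': ['2020-01-01', '2020-01-02'],
            'lat': [5, 6, 7, 8],
            'lon': [9, 10, 11, 12]},
    attrs={'time': '2022-05-17'},
)

sizes = dict(ds.sizes)          # dimension name -> length
coord_names = sorted(ds.coords) # coordinate names
var_names = list(ds.data_vars)  # data variable names
attrs = dict(ds.attrs)          # dataset-level attributes
```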
21. Extract the temperature DataArray from the Dataset in question 18
temp = data['temperature']
temp
<xarray.DataArray 'temperature' (day: 2, lat: 4, lon: 4)>
array([[[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.]],
[[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.]]])
Coordinates:
* lat (lat) int32 5 6 7 8
* lon (lon) int32 9 10 11 12
* day (day) object '2020-01-01' '2020-01-02'
III. Indexing and slicing
22. Select the value in da where day is 2020-01-01, lat is 5, and lon is 9
day = ['2020-01-01','2020-01-02','2020-01-03']
lat = [5,6,7,8]
lon = [9,10,11,12]
temperature = np.array(range(3*4*4)).reshape((3,4,4))
da = xr.DataArray(data=temperature, dims=['day','lat','lon'], coords=[day,lat,lon])
da
<xarray.DataArray (day: 3, lat: 4, lon: 4)>
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]],
[[16, 17, 18, 19],
[20, 21, 22, 23],
[24, 25, 26, 27],
[28, 29, 30, 31]],
[[32, 33, 34, 35],
[36, 37, 38, 39],
[40, 41, 42, 43],
[44, 45, 46, 47]]])
Coordinates:
* day (day) <U10 '2020-01-01' '2020-01-02' '2020-01-03'
* lat (lat) int32 5 6 7 8
* lon (lon) int32 9 10 11 12
# Method 1: positional indexing
da[0,0,0]
array(0)
Coordinates:
day <U10 '2020-01-01'
lat int32 5
lon int32 9
# Method 2: label-based indexing with .loc
da.loc['2020-01-01',5,9]
array(0)
Coordinates:
day <U10 '2020-01-01'
lat int32 5
lon int32 9
# Method 3: positional indexing by dimension name with isel
da.isel(day=0,lat=0,lon=0)
array(0)
Coordinates:
day <U10 '2020-01-01'
lat int32 5
lon int32 9
# Method 4: label-based indexing by dimension name with sel
da.sel(day='2020-01-01',lat=5,lon=9)
array(0)
Coordinates:
day <U10 '2020-01-01'
lat int32 5
lon int32 9
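All four methods above address the same element; they differ only in whether the axis is identified by order or by name, and whether the position is given as an integer or as a coordinate label. The sketch below checks this on a less trivial element (second day, third latitude, fourth longitude); the variable names are illustrative:

```python
import numpy as np
import xarray as xr

day = ['2020-01-01', '2020-01-02', '2020-01-03']
lat = [5, 6, 7, 8]
lon = [9, 10, 11, 12]
da = xr.DataArray(np.arange(48).reshape(3, 4, 4),
                  dims=['day', 'lat', 'lon'], coords=[day, lat, lon])

by_position = da[1, 2, 3].item()                            # axis order + integer position
by_label = da.loc['2020-01-02', 7, 12].item()               # axis order + label
by_isel = da.isel(day=1, lat=2, lon=3).item()               # axis name + integer position
by_sel = da.sel(day='2020-01-02', lat=7, lon=12).item()     # axis name + label
```

The named forms (`isel`, `sel`) keep working if the dimension order changes, for instance after a transpose, which the purely positional forms do not.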
23. Select the data in da from the previous question for the two days 2020-01-01 and 2020-01-02
# Method 1: positional slice
da[0:2,:,:]
<xarray.DataArray (day: 2, lat: 4, lon: 4)>
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]],
[[16, 17, 18, 19],
[20, 21, 22, 23],
[24, 25, 26, 27],
[28, 29, 30, 31]]])
Coordinates:
* day (day) <U10 '2020-01-01' '2020-01-02'
* lat (lat) int32 5 6 7 8
* lon (lon) int32 9 10 11 12
# Method 2: label-based slice with .loc
da.loc['2020-01-01':'2020-01-02',:,:]
<xarray.DataArray (day: 2, lat: 4, lon: 4)>
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]],
[[16, 17, 18, 19],
[20, 21, 22, 23],
[24, 25, 26, 27],
[28, 29, 30, 31]]])
Coordinates:
* day (day) <U10 '2020-01-01' '2020-01-02'
* lat (lat) int32 5 6 7 8
* lon (lon) int32 9 10 11 12
# Method 3: isel with a list of positions
da.isel(day=[0,1])
<xarray.DataArray (day: 2, lat: 4, lon: 4)>
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]],
[[16, 17, 18, 19],
[20, 21, 22, 23],
[24, 25, 26, 27],
[28, 29, 30, 31]]])
Coordinates:
* day (day) <U10 '2020-01-01' '2020-01-02'
* lat (lat) int32 5 6 7 8
* lon (lon) int32 9 10 11 12
# Method 4: sel with a list of labels
da.sel(day=['2020-01-01','2020-01-02'])
<xarray.DataArray (day: 2, lat: 4, lon: 4)>
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]],
[[16, 17, 18, 19],
[20, 21, 22, 23],
[24, 25, 26, 27],
[28, 29, 30, 31]]])
Coordinates:
* day (day) <U10 '2020-01-01' '2020-01-02'
* lat (lat) int32 5 6 7 8
* lon (lon) int32 9 10 11 12
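One detail worth noting in the methods above: a positional slice excludes its stop index, whereas a label-based slice includes both end labels (the same convention pandas uses). That is why `0:2` and `'2020-01-01':'2020-01-02'` return the same two days. A short sketch using `slice` objects with `isel` and `sel`, which is equivalent to the bracket syntax shown above:

```python
import numpy as np
import xarray as xr

day = ['2020-01-01', '2020-01-02', '2020-01-03']
lat = [5, 6, 7, 8]
lon = [9, 10, 11, 12]
da = xr.DataArray(np.arange(48).reshape(3, 4, 4),
                  dims=['day', 'lat', 'lon'], coords=[day, lat, lon])

# Positional slice: position 2 (the third day) is excluded.
positional = da.isel(day=slice(0, 2))

# Label slice: the end label '2020-01-02' is included.
labelled = da.sel(day=slice('2020-01-01', '2020-01-02'))
```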
24. Select the data in ds where day is 2020-01-01
day = ['2020-01-01','2020-01-02','2020-01-03']
lat = [5,6,7,8]
lon = [9,10,11,12]
temperature = np.array(range(3*4*4)).reshape((3,4,4))
da = xr.DataArray(data=temperature, dims=['day','lat','lon'], coords=[day,lat,lon])
ds = da.to_dataset(name = 'temperature')
ds
Dimensions: (day: 3, lat: 4, lon: 4)
Coordinates:
* day (day) <U10 '2020-01-01' '2020-01-02' '2020-01-03'
* lat (lat) int32 5 6 7 8
* lon (lon) int32 9 10 11 12
Data variables:
temperature (day, lat, lon) int32 0 1 2 3 4 5 6 7 ... 41 42 43 44 45 46 47
# Method 1: isel (by position)
ds.isel(day=0)
Dimensions: (lat: 4, lon: 4)
Coordinates:
day <U10 '2020-01-01'
* lat (lat) int32 5 6 7 8
* lon (lon) int32 9 10 11 12
Data variables:
temperature (lat, lon) int32 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
# Method 2: sel (by label)
ds.sel(day='2020-01-01')
Dimensions: (lat: 4, lon: 4)
Coordinates:
day <U10 '2020-01-01'
* lat (lat) int32 5 6 7 8
* lon (lon) int32 9 10 11 12
Data variables:
temperature (lat, lon) int32 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
25. Select the data in ds from the previous question where day is 2020-01-01 or 2020-01-03
# Method 1: isel (by position)
ds.isel(day=[0,2])
Dimensions: (day: 2, lat: 4, lon: 4)
Coordinates:
* day (day) <U10 '2020-01-01' '2020-01-03'
* lat (lat) int32 5 6 7 8
* lon (lon) int32 9 10 11 12
Data variables:
temperature (day, lat, lon) int32 0 1 2 3 4 5 6 7 ... 41 42 43 44 45 46 47
# Method 2: sel (by label)
ds.sel(day=['2020-01-01','2020-01-03'])
Dimensions: (day: 2, lat: 4, lon: 4)
Coordinates:
* day (day) <U10 '2020-01-01' '2020-01-03'
* lat (lat) int32 5 6 7 8
* lon (lon) int32 9 10 11 12
Data variables:
temperature (day, lat, lon) int32 0 1 2 3 4 5 6 7 ... 41 42 43 44 45 46 47
26. Select the first day of the temperature variable in ds from the previous question
ds['temperature'][0]
<xarray.DataArray 'temperature' (lat: 4, lon: 4)>
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
Coordinates:
day <U10 '2020-01-01'
* lat (lat) int32 5 6 7 8
* lon (lon) int32 9 10 11 12
27. Select the second and third days of the temperature variable in ds from the previous question
ds['temperature'][1:3]
<xarray.DataArray 'temperature' (day: 2, lat: 4, lon: 4)>
array([[[16, 17, 18, 19],
[20, 21, 22, 23],
[24, 25, 26, 27],
[28, 29, 30, 31]],
[[32, 33, 34, 35],
[36, 37, 38, 39],
[40, 41, 42, 43],
[44, 45, 46, 47]]])
Coordinates:
* day (day) <U10 '2020-01-02' '2020-01-03'
* lat (lat) int32 5 6 7 8
* lon (lon) int32 9 10 11 12
IV. Computation and groupby
28. Set the value in the third row and fourth column of da to 5
lat = [5,6,7,8]
lon = [9,10,11,12]
temperature = np.ones((4,4))
da = xr.DataArray(data=temperature, dims=['lat','lon'], coords=[lat,lon])
da
<xarray.DataArray (lat: 4, lon: 4)>
array([[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.]])
Coordinates:
* lat (lat) int32 5 6 7 8
* lon (lon) int32 9 10 11 12
da[2,3]=5
da
<xarray.DataArray (lat: 4, lon: 4)>
array([[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 5.],
[1., 1., 1., 1.]])
Coordinates:
* lat (lat) int32 5 6 7 8
* lon (lon) int32 9 10 11 12
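The assignment above addresses the cell by position. The same cell can also be addressed by its coordinate labels: the third row is lat=7 and the fourth column is lon=12. A sketch using `.loc` with a dict of labels, which is one way xarray supports label-based assignment:

```python
import numpy as np
import xarray as xr

lat = [5, 6, 7, 8]
lon = [9, 10, 11, 12]
da = xr.DataArray(np.ones((4, 4)), dims=['lat', 'lon'], coords=[lat, lon])

# Third row is lat=7, fourth column is lon=12.
da.loc[dict(lat=7, lon=12)] = 5
```

Assigning through `.loc` modifies `da` in place, just like `da[2,3]=5`.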
29. Subtract 4 from every element of da from the previous question
da-4
<xarray.DataArray (lat: 4, lon: 4)>
array([[-3., -3., -3., -3.],
[-3., -3., -3., -3.],
[-3., -3., -3., 1.],
[-3., -3., -3., -3.]])
Coordinates:
* lat (lat) int32 5 6 7 8
* lon (lon) int32 9 10 11 12
30. Multiply every element of da from question 28 by 3
da*3
<xarray.DataArray (lat: 4, lon: 4)>
array([[ 3., 3., 3., 3.],
[ 3., 3., 3., 3.],
[ 3., 3., 3., 15.],
[ 3., 3., 3., 3.]])
Coordinates:
* lat (lat) int32 5 6 7 8
* lon (lon) int32 9 10 11 12
31. Take the sine of every element of da from question 28
np.sin(da)
<xarray.DataArray (lat: 4, lon: 4)>
array([[ 0.841471, 0.841471, 0.841471, 0.841471],
[ 0.841471, 0.841471, 0.841471, 0.841471],
[ 0.841471, 0.841471, 0.841471, -0.958924],
[ 0.841471, 0.841471, 0.841471, 0.841471]])
Coordinates:
* lat (lat) int32 5 6 7 8
* lon (lon) int32 9 10 11 12
32. Transpose da from question 28
da.T
<xarray.DataArray (lon: 4, lat: 4)>
array([[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 5., 1.]])
Coordinates:
* lat (lat) int32 5 6 7 8
* lon (lon) int32 9 10 11 12
33. Divide the elements of the transposed array from the previous question by 2
da.T/2
<xarray.DataArray (lon: 4, lat: 4)>
array([[0.5, 0.5, 0.5, 0.5],
[0.5, 0.5, 0.5, 0.5],
[0.5, 0.5, 0.5, 0.5],
[0.5, 0.5, 2.5, 0.5]])
Coordinates:
* lat (lat) int32 5 6 7 8
* lon (lon) int32 9 10 11 12
34. Sum all elements of da from question 28
da.sum()
array(20.)
35. Average da from question 28 along the lat dimension
da.mean(dim='lat')
array([1., 1., 1., 2.])
Coordinates:
* lon (lon) int32 9 10 11 12
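Reducing along `lat` collapses the rows and leaves one value per `lon`. The last column holds 1, 1, 5, 1, so its mean is (1 + 1 + 5 + 1) / 4 = 2, while the other columns average to 1. The sketch below reproduces this and, for contrast, reduces along `lon` instead (the names `mean_over_lat` and `mean_over_lon` are illustrative):

```python
import numpy as np
import xarray as xr

lat = [5, 6, 7, 8]
lon = [9, 10, 11, 12]
da = xr.DataArray(np.ones((4, 4)), dims=['lat', 'lon'], coords=[lat, lon])
da[2, 3] = 5

mean_over_lat = da.mean(dim='lat')  # one value per lon
mean_over_lon = da.mean(dim='lon')  # one value per lat
```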
36. Take the absolute value of the temperature variable in ds
lat = [5,6,7,8]
lon = [9,10,11,12]
temperature = np.ones((4,4))*(-1)
ds = xr.Dataset(
data_vars={'temperature': (('lat', 'lon'),temperature)},
coords={'lat':('lat',lat),
'lon':('lon',lon) })
ds
Dimensions: (lat: 4, lon: 4)
Coordinates:
* lat (lat) int32 5 6 7 8
* lon (lon) int32 9 10 11 12
Data variables:
temperature (lat, lon) float64 -1.0 -1.0 -1.0 -1.0 ... -1.0 -1.0 -1.0 -1.0
abs(ds['temperature'])
<xarray.DataArray 'temperature' (lat: 4, lon: 4)>
array([[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.]])
Coordinates:
* lat (lat) int32 5 6 7 8
* lon (lon) int32 9 10 11 12
37. Group da by the lat dimension
lat = [5,6,5,6]
lon = [9,10,11,12]
temperature = np.arange(1,17).reshape(4,4)
da = xr.DataArray(data=temperature, dims=['lat','lon'], coords=[lat,lon])
da
<xarray.DataArray (lat: 4, lon: 4)>
array([[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12],
[13, 14, 15, 16]])
Coordinates:
* lat (lat) int32 5 6 5 6
* lon (lon) int32 9 10 11 12
da.groupby('lat')
<xarray.core.groupby.DataArrayGroupBy at 0x1fb6d5ae400>
38. Inspect the groups produced in the previous question
da.groupby('lat').groups
{5: [0, 2], 6: [1, 3]}
list(da.groupby('lat'))
[(5,
<xarray.DataArray (lat: 2, lon: 4)>
array([[ 1, 2, 3, 4],
[ 9, 10, 11, 12]])
Coordinates:
* lat (lat) int32 5 5
* lon (lon) int32 9 10 11 12),
(6,
<xarray.DataArray (lat: 2, lon: 4)>
array([[ 5, 6, 7, 8],
[13, 14, 15, 16]])
Coordinates:
* lat (lat) int32 6 6
* lon (lon) int32 9 10 11 12)]
39. Take the mean of the groups from the previous question
da.groupby('lat').mean()
D:\Anaconda3\lib\site-packages\xarray\core\groupby.py:639: FutureWarning: Default reduction dimension will be changed to the grouped dimension in a future version of xarray. To silence this warning, pass dim=xarray.ALL_DIMS explicitly.
skipna=skipna, allow_lazy=True, **kwargs)
array([ 6.5, 10.5])
Coordinates:
* lat (lat) int64 5 6
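The FutureWarning above announces that the default reduction dimension of a grouped mean will change to the grouped dimension. Assuming a recent xarray release in which that change has taken effect, `groupby('lat').mean()` keeps the `lon` dimension and returns one row per group. Averaging that result over `lon` reproduces the two values shown above, because every `lon` column contributes the same number of elements to each group. A sketch under that assumption:

```python
import numpy as np
import xarray as xr

lat = [5, 6, 5, 6]
lon = [9, 10, 11, 12]
da = xr.DataArray(np.arange(1, 17).reshape(4, 4),
                  dims=['lat', 'lon'], coords=[lat, lon])

# In recent xarray versions this reduces over the grouped dimension only,
# giving a (lat: 2, lon: 4) result.
per_lon = da.groupby('lat').mean()

# Reducing over lon as well gives one mean per lat group.
overall = per_lon.mean(dim='lon')
```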
40. Group ds by the lon dimension
lat = [5,6,7,8]
lon = [9,20,20,20]
temperature = np.ones((4,4))*(-1)
ds = xr.Dataset(
data_vars={'temperature': (('lat', 'lon'),temperature)},
coords={'lat':('lat',lat),
'lon':('lon',lon) })
ds
Dimensions: (lat: 4, lon: 4)
Coordinates:
* lat (lat) int32 5 6 7 8
* lon (lon) int32 9 20 20 20
Data variables:
temperature (lat, lon) float64 -1.0 -1.0 -1.0 -1.0 ... -1.0 -1.0 -1.0 -1.0
ds.groupby('lon')
<xarray.core.groupby.DatasetGroupBy at 0x1fb6d59af98>
list(ds.groupby('lon'))
[(9,
Dimensions: (lat: 4, lon: 1)
Coordinates:
* lat (lat) int32 5 6 7 8
* lon (lon) int32 9
Data variables:
temperature (lat, lon) float64 -1.0 -1.0 -1.0 -1.0),
(20,
Dimensions: (lat: 4, lon: 3)
Coordinates:
* lat (lat) int32 5 6 7 8
* lon (lon) int32 20 20 20
Data variables:
temperature (lat, lon) float64 -1.0 -1.0 -1.0 -1.0 ... -1.0 -1.0 -1.0 -1.0)]
V. Missing values and interpolation
41. Find which values in da are missing
da = xr.DataArray([3,4,np.nan,6,np.nan,8],dims=['lat'])
da
array([ 3., 4., nan, 6., nan, 8.])
Dimensions without coordinates: lat
da.isnull()
array([False, False, True, False, True, False])
Dimensions without coordinates: lat
42. Find which values in da from the previous question are not missing
da.notnull()
array([ True, True, False, True, False, True])
Dimensions without coordinates: lat
43. Count the non-missing elements of da from question 41
da.count()
array(4)
44. Drop the missing values along the lat dimension of da from question 41
da.dropna(dim='lat')
array([3., 4., 6., 8.])
Dimensions without coordinates: lat
45. Fill the missing values in da from question 41
da.fillna(999)
array([ 3., 4., 999., 6., 999., 8.])
Dimensions without coordinates: lat
46. Fill the missing values in da from question 41 by linear interpolation
da.interpolate_na(dim='lat',method='linear')
array([3., 4., 5., 6., 7., 8.])
Dimensions without coordinates: lat
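Because `da` has no coordinate on `lat`, the interpolation treats the positions 0, 1, 2, ... as evenly spaced. Each isolated gap is therefore filled with the midpoint of its two neighbours: (4 + 6) / 2 = 5 and (6 + 8) / 2 = 7. The sketch below checks that arithmetic; `manual_first_gap` and `manual_second_gap` are illustrative names:

```python
import numpy as np
import xarray as xr

da = xr.DataArray([3, 4, np.nan, 6, np.nan, 8], dims=['lat'])

filled_linear = da.interpolate_na(dim='lat', method='linear')

# Hand-computed midpoints of the neighbours around each gap.
manual_first_gap = (4 + 6) / 2
manual_second_gap = (6 + 8) / 2
```

Unlike `fillna(999)`, which writes an arbitrary constant, the interpolated values stay consistent with the surrounding data.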
VI. Concatenation
47. Concatenate da_1 and da_2 along the day dimension
day = ['day1','day2','day3','day4']
da_1 = xr.DataArray([1,2,3,4],dims=['day'],coords=[day])
day = ['day5','day6','day7','day8']
da_2 = xr.DataArray([5,6,7,8],dims=['day'],coords=[day])
print(da_1)
print('-----'*5)
print(da_2)
array([1, 2, 3, 4])
Coordinates:
* day (day) <U4 'day1' 'day2' 'day3' 'day4'
-------------------------
array([5, 6, 7, 8])
Coordinates:
* day (day) <U4 'day5' 'day6' 'day7' 'day8'
xr.concat([da_1,da_2],dim='day')
array([1, 2, 3, 4, 5, 6, 7, 8])
Coordinates:
* day (day) object 'day1' 'day2' 'day3' 'day4' ... 'day6' 'day7' 'day8'
48. Concatenate the temperature1 variable of ds_1 and the temperature2 variable of ds_2 along the day dimension
day = ['day1','day2','day3','day4']
da_1 = xr.DataArray([1,2,3,4],dims=['day'],coords=[day])
ds_1 = da_1.to_dataset(name = 'temperature1')
day = ['day5','day6','day7','day8']
da_2 = xr.DataArray([5,6,7,8],dims=['day'],coords=[day])
ds_2 = da_2.to_dataset(name = 'temperature2')
print(ds_1)
print('-----'*5)
print(ds_2)
Dimensions: (day: 4)
Coordinates:
* day (day) <U4 'day1' 'day2' 'day3' 'day4'
Data variables:
temperature1 (day) int32 1 2 3 4
-------------------------
Dimensions: (day: 4)
Coordinates:
* day (day) <U4 'day5' 'day6' 'day7' 'day8'
Data variables:
temperature2 (day) int32 5 6 7 8
xr.concat([ds_1['temperature1'],ds_2['temperature2']],dim='day')
array([1, 2, 3, 4, 5, 6, 7, 8])
Coordinates:
* day (day) object 'day1' 'day2' 'day3' 'day4' ... 'day6' 'day7' 'day8'
49. Concatenate ds_1 and ds_2 along the day dimension (both Datasets have the same variable)
day = ['day1','day2','day3','day4']
da_1 = xr.DataArray([1,2,3,4],dims=['day'],coords=[day])
ds_1 = da_1.to_dataset(name = 'temperature')
day = ['day5','day6','day7','day8']
da_2 = xr.DataArray([5,6,7,8],dims=['day'],coords=[day])
ds_2 = da_2.to_dataset(name = 'temperature')
print(ds_1)
print('-----'*5)
print(ds_2)
Dimensions: (day: 4)
Coordinates:
* day (day) <U4 'day1' 'day2' 'day3' 'day4'
Data variables:
temperature (day) int32 1 2 3 4
-------------------------
Dimensions: (day: 4)
Coordinates:
* day (day) <U4 'day5' 'day6' 'day7' 'day8'
Data variables:
temperature (day) int32 5 6 7 8
# Method 1: concat along day
xr.concat([ds_1,ds_2],dim='day')
Dimensions: (day: 8)
Coordinates:
* day (day) object 'day1' 'day2' 'day3' ... 'day6' 'day7' 'day8'
Data variables:
temperature (day) int32 1 2 3 4 5 6 7 8
# Method 2: merge
xr.merge([ds_1,ds_2])
Dimensions: (day: 8)
Coordinates:
* day (day) object 'day1' 'day2' 'day3' ... 'day6' 'day7' 'day8'
Data variables:
temperature (day) float64 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0
50. Combine ds_1 and ds_2 along the day dimension (the two Datasets have different variables)
day = ['day1','day2','day3','day4']
da_1 = xr.DataArray([1,2,3,4],dims=['day'],coords=[day])
ds_1 = da_1.to_dataset(name = 'temperature1')
day = ['day5','day6','day7','day8']
da_2 = xr.DataArray([5,6,7,8],dims=['day'],coords=[day])
ds_2 = da_2.to_dataset(name = 'temperature2')
print(ds_1)
print('-----'*5)
print(ds_2)
Dimensions: (day: 4)
Coordinates:
* day (day) <U4 'day1' 'day2' 'day3' 'day4'
Data variables:
temperature1 (day) int32 1 2 3 4
-------------------------
Dimensions: (day: 4)
Coordinates:
* day (day) <U4 'day5' 'day6' 'day7' 'day8'
Data variables:
temperature2 (day) int32 5 6 7 8
xr.merge([ds_1,ds_2])
Dimensions: (day: 8)
Coordinates:
* day (day) object 'day1' 'day2' 'day3' ... 'day6' 'day7' 'day8'
Data variables:
temperature1 (day) float64 1.0 2.0 3.0 4.0 nan nan nan nan
temperature2 (day) float64 nan nan nan nan 5.0 6.0 7.0 8.0
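The NaN values and the switch from int32 to float64 in the merged result come from alignment: `merge` takes the union of the `day` labels (an outer join), and each variable is NaN on the days it did not originally cover. The sketch below makes the join explicit by passing `join='outer'`, which matches the behavior shown above; `n_nan_t1` and `n_nan_t2` are illustrative names:

```python
import xarray as xr

da_1 = xr.DataArray([1, 2, 3, 4], dims=['day'],
                    coords=[['day1', 'day2', 'day3', 'day4']])
da_2 = xr.DataArray([5, 6, 7, 8], dims=['day'],
                    coords=[['day5', 'day6', 'day7', 'day8']])
ds_1 = da_1.to_dataset(name='temperature1')
ds_2 = da_2.to_dataset(name='temperature2')

# Outer join: the result covers all eight days.
merged = xr.merge([ds_1, ds_2], join='outer')

# Each variable is missing on the four days that came from the other Dataset.
n_nan_t1 = int(merged['temperature1'].isnull().sum())
n_nan_t2 = int(merged['temperature2'].isnull().sum())
```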