Random Sampling
Random Uniform Sampling
For many spatial models, it is common to take a random sample of the predictors to represent a single class (e.g. an environmental background or pseudo-absences in a binary classification model). The sample function, supplied in the sampling module and called here as a method of the Raster object, serves this purpose:
from pyspatialml import Raster
import pyspatialml.datasets.nc as nc
import matplotlib.pyplot as plt

# stack the six example bands into a single Raster
predictors = [nc.band1, nc.band2, nc.band3, nc.band4, nc.band5, nc.band7]
stack = Raster(predictors)

# extract training data using a random sample
df_rand = stack.sample(size=1000, random_state=1)
df_rand.plot()
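To make the idea of uniform random sampling concrete at the cell level, here is a minimal sketch in plain Python. It is independent of pyspatialml and is not how the library implements sample. The function name sample_cell_centres, its parameters, and the choice to sample without replacement are all assumptions made for illustration.

```python
import random


def sample_cell_centres(n_rows, n_cols, size, x_min, y_max, cell_size, seed=None):
    """Draw `size` distinct grid cells uniformly at random.

    Hypothetical helper: returns (row, col, x, y) tuples, where x and y
    are the cell-centre coordinates of a north-up grid whose top-left
    corner is (x_min, y_max).
    """
    rng = random.Random(seed)
    # Sampling flat indices without replacement gives every cell
    # the same probability of being selected.
    flat_indices = rng.sample(range(n_rows * n_cols), size)

    points = []
    for idx in flat_indices:
        row, col = divmod(idx, n_cols)
        x = x_min + (col + 0.5) * cell_size
        y = y_max - (row + 0.5) * cell_size
        points.append((row, col, x, y))
    return points


# A 4 x 5 grid of 10-unit cells; a fixed seed makes the draw repeatable.
points = sample_cell_centres(
    n_rows=4, n_cols=5, size=6, x_min=0.0, y_max=40.0, cell_size=10.0, seed=1
)
```

Fixing the seed plays the same role as random_state in the call above: repeated runs return the same sample locations.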
Stratified Random Sampling
The sample function also supports stratified random sampling when a categorical raster dataset is passed to the strata argument. The categorical raster should spatially overlap with the dataset to be sampled, but it does not need to have the same grid resolution. This raster should be passed as an opened rasterio dataset:
strata = Raster(nc.strata)
df_strata = stack.sample(size=5, strata=strata, random_state=1)
df_strata = df_strata.dropna()
df_strata
index | lsat7_2000_10 | lsat7_2000_20 | lsat7_2000_30 | lsat7_2000_40 | lsat7_2000_50 | lsat7_2000_70 | geometry |
---|---|---|---|---|---|---|---|
0 | 96.0 | 78.0 | 88.0 | 49.0 | 71.0 | 63.0 | POINT (641093.250 225135.750) |
1 | 113.0 | 103.0 | 122.0 | 66.0 | 136.0 | 110.0 | POINT (640979.250 222342.750) |
3 | 82.0 | 66.0 | 67.0 | 64.0 | 76.0 | 52.0 | POINT (640095.750 225848.250) |
4 | 99.0 | 88.0 | 95.0 | 56.0 | 98.0 | 78.0 | POINT (637559.250 226788.750) |
5 | 81.0 | 69.0 | 76.0 | 73.0 | 118.0 | 72.0 | POINT (635621.250 218324.250) |
10 | 91.0 | 78.0 | 81.0 | 77.0 | 97.0 | 73.0 | POINT (634709.250 221943.750) |
11 | 72.0 | 61.0 | 51.0 | 104.0 | 91.0 | 47.0 | POINT (639269.250 220005.750) |
12 | 86.0 | 75.0 | 78.0 | 73.0 | 87.0 | 60.0 | POINT (639326.250 224964.750) |
13 | 71.0 | 53.0 | 48.0 | 59.0 | 78.0 | 46.0 | POINT (635222.250 218951.250) |
15 | 76.0 | 59.0 | 63.0 | 65.0 | 114.0 | 64.0 | POINT (633027.750 218580.750) |
17 | 75.0 | 61.0 | 55.0 | 70.0 | 74.0 | 43.0 | POINT (633369.750 219435.750) |
18 | 78.0 | 66.0 | 69.0 | 69.0 | 110.0 | 72.0 | POINT (633198.750 225506.250) |
19 | 68.0 | 52.0 | 40.0 | 79.0 | 58.0 | 30.0 | POINT (637986.750 222998.250) |
20 | 70.0 | 55.0 | 52.0 | 62.0 | 79.0 | 47.0 | POINT (635649.750 217440.750) |
22 | 71.0 | 53.0 | 48.0 | 64.0 | 77.0 | 42.0 | POINT (635564.250 222713.250) |
23 | 72.0 | 53.0 | 51.0 | 58.0 | 82.0 | 51.0 | POINT (633056.250 218324.250) |
26 | 81.0 | 78.0 | 79.0 | 34.0 | 41.0 | 28.0 | POINT (639297.750 223625.250) |
27 | 73.0 | 57.0 | 51.0 | 16.0 | 14.0 | 10.0 | POINT (635364.750 224736.750) |
28 | 73.0 | 57.0 | 52.0 | 55.0 | 57.0 | 40.0 | POINT (635535.750 223311.750) |
30 | 138.0 | 120.0 | 132.0 | 65.0 | 129.0 | 126.0 | POINT (634196.250 226190.250) |
31 | 72.0 | 60.0 | 47.0 | 69.0 | 82.0 | 46.0 | POINT (639810.750 219749.250) |
32 | 132.0 | 122.0 | 140.0 | 73.0 | 171.0 | 176.0 | POINT (640352.250 218238.750) |
33 | 170.0 | 157.0 | 176.0 | 80.0 | 182.0 | 183.0 | POINT (639924.750 219692.250) |
34 | 115.0 | 98.0 | 106.0 | 60.0 | 110.0 | 102.0 | POINT (639953.250 219578.250) |
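The logic behind stratified random sampling can be sketched in plain Python as follows. This is a conceptual illustration only and does not reproduce pyspatialml's implementation. The function stratified_sample, the list-of-lists strata grid, and the use of None for nodata cells are assumptions made for this example.

```python
import random
from collections import defaultdict


def stratified_sample(strata, size, seed=None):
    """Draw up to `size` cells at random from each category of a 2-D grid.

    Hypothetical helper: `strata` is a list of rows holding category
    values, with None marking nodata cells (an assumption of this sketch).
    Returns a dict mapping each category to a list of (row, col) cells.
    """
    rng = random.Random(seed)

    # Group the cell positions by the category they belong to.
    cells_by_class = defaultdict(list)
    for row, values in enumerate(strata):
        for col, value in enumerate(values):
            if value is not None:
                cells_by_class[value].append((row, col))

    samples = {}
    for value in sorted(cells_by_class):
        cells = cells_by_class[value]
        # A stratum with fewer than `size` cells contributes all of them.
        samples[value] = rng.sample(cells, min(size, len(cells)))
    return samples


strata_grid = [
    [1, 1, 2, 2],
    [1, 1, 2, None],
    [3, 3, 3, 2],
]
picked = stratified_sample(strata_grid, size=2, seed=1)
```

Each category contributes the same number of cells regardless of how much of the grid it covers, which is what distinguishes this from the uniform sample shown earlier.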