For many spatial models, it is common to take a random sample of the predictors to represent a single class (e.g. an environmental background or pseudo-absences in a binary classification model). The sample method is supplied for this purpose and is called on the Raster object:
    from pyspatialml import Raster
    import pyspatialml.datasets.nc as nc
    import matplotlib.pyplot as plt

    predictors = [nc.band1, nc.band2, nc.band3, nc.band4, nc.band5, nc.band7]
    stack = Raster(predictors)

    # extract training data using a random sample
    df_rand = stack.sample(size=1000, random_state=1)
    df_rand.plot()
    plt.show()  # display the figure when running as a script
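To make the idea of random background sampling concrete, the following is a minimal conceptual sketch in plain Python. It is not how pyspatialml implements sampling. The function name sample_cells, the toy grid, and the nodata value of -1 are all assumptions made for illustration. The sketch collects every cell that holds valid data and then draws a reproducible random subset without replacement:

```python
import random


def sample_cells(grid, size, nodata=None, random_state=None):
    """Draw up to `size` random (row, col, value) records from a 2D grid.

    Cells equal to `nodata` are skipped. Hypothetical helper for
    illustration only; not part of pyspatialml.
    """
    rng = random.Random(random_state)  # seeded generator for reproducibility
    valid = [
        (r, c, value)
        for r, row in enumerate(grid)
        for c, value in enumerate(row)
        if value != nodata
    ]
    # Sample without replacement; cap the size at the number of valid cells
    return rng.sample(valid, min(size, len(valid)))


# Hypothetical 4 x 5 single-band grid; -1 marks nodata cells
grid = [
    [3, 7, -1, 2, 9],
    [4, -1, 6, 1, 8],
    [5, 2, 3, -1, 7],
    [1, 9, 4, 6, 2],
]

background = sample_cells(grid, size=6, nodata=-1, random_state=1)
for row, col, value in background:
    print(row, col, value)
```

Fixing random_state makes the draw repeatable, which mirrors the role of the random_state argument in the example above.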
Stratified Random Sampling
The sample method also supports stratified random sampling when a categorical raster dataset is passed to the strata argument. The categorical raster should spatially overlap the dataset to be sampled, but it does not need to have the same grid resolution. This raster should be passed as an open rasterio dataset: