Quick start


Initiating a Raster Object

We are going to use a set of Landsat 7 bands contained within the nc example data:

from pyspatialml import Raster
import pyspatialml.datasets.nc as nc
import matplotlib.pyplot as plt

predictors = [nc.band1, nc.band2, nc.band3, nc.band4, nc.band5, nc.band7]

These raster datasets are aligned in terms of their extent and coordinate reference systems. We can ‘stack’ these into a Raster class so that we can perform machine learning related operations on the set of rasters:

stack = Raster(predictors)

When a Raster object is created, the name of each layer is generated automatically from a syntactically correct version of its file basename:

stack.names
dict_keys(['lsat7_2000_10', 'lsat7_2000_20', 'lsat7_2000_30', 'lsat7_2000_40', 'lsat7_2000_50', 'lsat7_2000_70'])
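
To make the naming step concrete, here is a minimal sketch of how a file basename might be turned into a valid Python name. It is an illustration only, not pyspatialml's implementation: the helper `make_layer_name`, the example paths, and the exact rules (strip the extension, replace invalid characters with underscores, prefix a leading digit) are all assumptions.

```python
import os
import re


def make_layer_name(path):
    """Derive a valid Python identifier from a file path (hypothetical helper)."""
    # Drop the directory and the file extension
    base = os.path.splitext(os.path.basename(path))[0]
    # Replace anything that is not a letter, digit or underscore
    name = re.sub(r"\W", "_", base)
    # Identifiers cannot be empty or start with a digit
    if not name or name[0].isdigit():
        name = "_" + name
    return name


print(make_layer_name("/data/lsat7_2000_10.tif"))
print(make_layer_name("/data/2000 band-1.tif"))
```

A name that is a valid identifier is what makes attribute-style access such as stack.lsat7_2000_10 possible.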

Color ramps and matplotlib.colors.Normalize objects can be assigned to each RasterLayer in the object using the cmap and norm attributes, for convenience when plotting:

stack.lsat7_2000_10.cmap = "Blues"
stack.lsat7_2000_20.cmap = "Greens"
stack.lsat7_2000_30.cmap = "Reds"
stack.lsat7_2000_40.cmap = "RdPu"
stack.lsat7_2000_50.cmap = "autumn"
stack.lsat7_2000_70.cmap = "hot"

stack.plot(
    title_fontsize=8,
    label_fontsize=6,
    legend_fontsize=6,
    names=["B1", "B2", "B3", "B4", "B5", "B7"],
    fig_kwds={"figsize": (8, 4)},
    subplots_kwds={"wspace": 0.3}
)
plt.show()

Subsetting and Indexing

Indexing of Raster objects is provided by several methods:

The Raster[keys] method enables key-based indexing using the name of a RasterLayer or a list of names. Direct subsetting of a Raster object instance returns a RasterLayer if only a single label is used; otherwise it returns a new Raster object containing only the selected layers.

The Raster.iloc[int, list, tuple, slice] method allows a Raster object instance to be subset using integer-based indexing or slicing. The iloc method returns a RasterLayer object if only a single index is used; otherwise it returns a new Raster object containing only the selected layers.

Subsetting of a Raster object instance can also occur by using attribute names in the form of Raster.name_of_layer. Because only a single RasterLayer can be subset at one time using this approach, a RasterLayer object is always returned.

Examples of methods to subset a Raster object:

# subset based on position
single_layer = stack.iloc[0]

# subset using a slice
new_raster_obj = stack.iloc[0:3]

# subset using labels
single_layer = stack['lsat7_2000_10']
single_layer = stack.lsat7_2000_10

# list or tuple of keys
new_raster_obj = stack[('lsat7_2000_10', 'lsat7_2000_20')]
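
The return-type rule described above (a single key or index gives one layer, several give a new container) can be modelled with a small toy class. This is a sketch for illustration, not the pyspatialml Raster class: `ToyStack`, its `by_position` method, and the integer "layers" are hypothetical stand-ins.

```python
class ToyStack:
    """Toy stand-in for a layer stack; NOT the pyspatialml Raster class."""

    def __init__(self, layers):
        # dict preserves insertion order, which gives each layer a position
        self._layers = dict(layers)

    @property
    def names(self):
        return list(self._layers)

    def __getitem__(self, key):
        # A single label returns the layer itself
        if isinstance(key, str):
            return self._layers[key]
        # A list or tuple of labels returns a new stack
        return ToyStack({k: self._layers[k] for k in key})

    def by_position(self, index):
        # A single integer returns the layer itself
        if isinstance(index, int):
            return self._layers[self.names[index]]
        # A slice, list or tuple returns a new stack
        if isinstance(index, slice):
            selected = self.names[index]
        else:
            selected = [self.names[i] for i in index]
        return ToyStack({k: self._layers[k] for k in selected})


toy = ToyStack({"b1": 10, "b2": 20, "b3": 30})
print(toy["b1"])                        # one layer
print(toy[("b1", "b2")].names)          # new stack with two layers
print(toy.by_position(slice(0, 2)).names)
```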

Iterate through RasterLayers individually:

for name, layer in stack.items():
    print(name, layer)
lsat7_2000_10 <pyspatialml.rasterlayer.RasterLayer object at 0xff5c4aa89190>
lsat7_2000_20 <pyspatialml.rasterlayer.RasterLayer object at 0xff5c4981ef90>
lsat7_2000_30 <pyspatialml.rasterlayer.RasterLayer object at 0xff5c3ceac5d0>
lsat7_2000_40 <pyspatialml.rasterlayer.RasterLayer object at 0xff5c423c2950>
lsat7_2000_50 <pyspatialml.rasterlayer.RasterLayer object at 0xff5c3cb0d310>
lsat7_2000_70 <pyspatialml.rasterlayer.RasterLayer object at 0xff5c3cb0d850>

Replace a RasterLayer with another:

stack.iloc[0] = Raster(nc.band7).iloc[0]

stack.iloc[0].plot()
plt.show()

Appending and Dropping Layers

Append layers from another Raster to the stack. Duplicate names are automatically given a suffix.

stack.append(Raster(nc.band7), in_place=True)
stack.names
dict_keys(['lsat7_2000_10', 'lsat7_2000_20', 'lsat7_2000_30', 'lsat7_2000_40', 'lsat7_2000_50', 'lsat7_2000_70_1', 'lsat7_2000_70_2'])
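
The suffix pattern in the output above, where both the existing and the appended layer receive a number, can be reproduced with a short sketch. The function `suffix_duplicates` is hypothetical and only mimics the visible naming pattern; it is not taken from pyspatialml.

```python
from collections import Counter


def suffix_duplicates(names):
    """Append _1, _2, ... to every name that occurs more than once."""
    totals = Counter(names)   # how often each name occurs overall
    seen = Counter()          # how many copies have been numbered so far
    result = []
    for name in names:
        if totals[name] > 1:
            seen[name] += 1
            result.append(f"{name}_{seen[name]}")
        else:
            # Unique names are left unchanged
            result.append(name)
    return result


names = ["lsat7_2000_50", "lsat7_2000_70", "lsat7_2000_70"]
print(suffix_duplicates(names))
```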

Rename RasterLayers using a dict of old_name : new_name pairs:

stack.names
stack.rename({'lsat7_2000_30': 'new_name'}, in_place=True)
stack.names
stack.new_name
stack['new_name']
<pyspatialml.rasterlayer.RasterLayer at 0xff5c3ceac5d0>

Drop a RasterLayer:

stack.names
stack.drop(labels='lsat7_2000_70_1', in_place=True)
stack.names
dict_keys(['lsat7_2000_10', 'lsat7_2000_20', 'new_name', 'lsat7_2000_40', 'lsat7_2000_50', 'lsat7_2000_70_2'])

Integration with Pandas

Data from a Raster object can be converted into a pandas.DataFrame, with each pixel represented by a row, and columns holding the x, y coordinates and the values of each RasterLayer in the Raster object:

import pandas as pd

df = stack.to_pandas(max_pixels=50000, resampling='nearest')
df.head()
               x         y  lsat7_2000_10  lsat7_2000_20  new_name  lsat7_2000_40  lsat7_2000_50  lsat7_2000_70_2
0  630534.000000  228114.0            NaN            NaN       NaN            NaN            NaN              NaN
1  630562.558402  228114.0            NaN            NaN       NaN            NaN            NaN              NaN
2  630591.116803  228114.0            NaN            NaN       NaN            NaN            NaN              NaN
3  630619.675205  228114.0            NaN            NaN       NaN            NaN            NaN              NaN
4  630648.233607  228114.0            NaN            NaN       NaN            NaN            NaN              NaN

The original raster is down-sampled based on max_pixels and the resampling method, which can be any of the resampling methods available in the underlying rasterio library for decimated reads.
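
How a max_pixels budget could translate into a reduced read shape is sketched below. The text does not state the exact rule pyspatialml applies, so the square-root scaling (which keeps the aspect ratio), the helper `decimated_shape`, and the example raster dimensions are all assumptions.

```python
import math


def decimated_shape(rows, cols, max_pixels):
    """Return a (rows, cols) shape holding at most max_pixels cells."""
    n_pixels = rows * cols
    # Small rasters are read at full resolution
    if n_pixels <= max_pixels:
        return rows, cols
    # Scale both dimensions by the same factor to keep the aspect ratio
    scale = math.sqrt(max_pixels / n_pixels)
    return max(1, int(rows * scale)), max(1, int(cols * scale))


# A made-up 1000 x 2000 raster reduced to a 50000-pixel budget
print(decimated_shape(1000, 2000, 50000))
```

Because each output cell then covers several source pixels, the resampling method decides how those pixels are combined or selected.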

Saving a Raster to File

Save a Raster:

import tempfile

tmp_tif = tempfile.NamedTemporaryFile().name + '.tif'
newstack = stack.write(file_path=tmp_tif, nodata=-9999)
newstack.new_name.read()
newstack = None
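
The temporary path above is built by creating a NamedTemporaryFile and appending '.tif' to its name. An alternative, shown here as a sketch using only the standard library, is to build the path inside a temporary directory that is cleaned up automatically. The file name "stack.tif" is arbitrary, and the write call is shown only as a comment.

```python
import os
import tempfile

# The directory and everything in it are removed when the block exits
with tempfile.TemporaryDirectory() as tmp_dir:
    tmp_tif = os.path.join(tmp_dir, "stack.tif")
    # A call such as stack.write(file_path=tmp_tif, nodata=-9999) would go here
    print(tmp_tif)
```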