rasterio.features module

Functions for working with features in a raster dataset.

rasterio.features.bounds(geometry, north_up=True, transform=None)

Return a (left, bottom, right, top) bounding box.

From Fiona 1.4.8. Modified to return bbox from geometry if available.

Parameters:

geometry (GeoJSON-like feature (implements __geo_interface__),) – feature collection, or geometry.

Returns:

Bounding box: (left, bottom, right, top)

Return type:

tuple

rasterio.features.dataset_features(src, bidx=None, sampling=1, band=True, as_mask=False, with_nodata=False, geographic=True, precision=-1)

Yield GeoJSON features for the dataset

The geometries are polygons bounding contiguous regions of the same raster value.

Parameters:
  • src (Rasterio Dataset)

  • bidx (int) – band index

  • sampling (int (DEFAULT: 1)) – Inverse of the sampling fraction; a value of 10 decimates

  • band (boolean (DEFAULT: True)) – extract features from a band (True) or a mask (False)

  • as_mask (boolean (DEFAULT: False)) – Interpret band as a mask and output only one class of valid data shapes?

  • with_nodata (boolean (DEFAULT: False)) – Include nodata regions?

  • geographic (str (DEFAULT: True)) – Output shapes in EPSG:4326? Otherwise use the native CRS.

  • precision (int (DEFAULT: -1)) – Decimal precision of coordinates. -1 for full float precision output

Yields:

GeoJSON-like Feature dictionaries for shapes found in the given band

rasterio.features.geometry_mask(geometries, out_shape, transform, all_touched=False, invert=False)

Create a mask from shapes.

By default, mask is intended for use as a numpy mask, where pixels that overlap shapes are False.

Parameters:
  • geometries (iterable over geometries (GeoJSON-like objects))

  • out_shape (tuple or list) – Shape of output numpy.ndarray.

  • transform (Affine transformation object) – Transformation from pixel coordinates of source to the coordinate system of the input shapes. See the transform property of dataset objects.

  • all_touched (boolean, optional) – If True, all pixels touched by geometries will be burned in. If False, only pixels whose center is within the polygon or that are selected by Bresenham’s line algorithm will be burned in. False by default

  • invert (boolean, optional) – If True, mask will be True for pixels that overlap shapes. False by default.

Returns:

Type is numpy.bool_

Return type:

numpy.ndarray

Notes

See rasterize() for performance notes.

rasterio.features.geometry_window(dataset, shapes, pad_x=0, pad_y=0, north_up=None, rotated=None, pixel_precision=None, boundless=False)

Calculate the window within the raster that fits the bounds of the geometry plus optional padding. The window is the outermost pixel indices that contain the geometry (floor of offsets, ceiling of width and height).

If shapes do not overlap raster, a WindowError is raised.

Parameters:
  • dataset (dataset object opened in 'r' mode) – Raster for which the mask will be created.

  • shapes (iterable over geometries.) – A geometry is a GeoJSON-like object or implements the geo interface. Must be in same coordinate system as dataset.

  • pad_x (float) – Amount of padding (as fraction of raster’s x pixel size) to add to left and right side of bounds.

  • pad_y (float) – Amount of padding (as fraction of raster’s y pixel size) to add to top and bottom of bounds.

  • north_up (optional) – This parameter is ignored since version 1.2.1. A deprecation warning will be emitted in 1.3.0.

  • rotated (optional) – This parameter is ignored since version 1.2.1. A deprecation warning will be emitted in 1.3.0.

  • pixel_precision (int or float, optional) – Number of places of rounding precision or absolute precision for evaluating bounds of shapes.

  • boundless (bool, optional) – Whether to allow a boundless window or not.

Return type:

rasterio.windows.Window

rasterio.features.is_valid_geom(geom)

Checks to see if geometry is a valid GeoJSON geometry type or GeometryCollection. Geometry must be GeoJSON or implement the geo interface.

Geometries must be non-empty, and have at least x, y coordinates.

Note: only the first coordinate is checked for validity.

Parameters:

geom (an object that implements the geo interface or GeoJSON-like object)

Returns:

bool

Return type:

True if object is a valid GeoJSON geometry type

rasterio.features.rasterize(shapes, out_shape=None, fill=0, out=None, transform=(1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0), all_touched=False, merge_alg=MergeAlg.replace, default_value=1, dtype=None, skip_invalid=True)

Return an image array with input geometries burned in.

Warnings will be raised for any invalid or empty geometries, and an exception will be raised if there are no valid shapes to rasterize.

Parameters:
  • shapes (iterable of (geometry, value) pairs or geometries) – The geometry can either be an object that implements the geo interface or GeoJSON-like object. If no value is provided the default_value will be used. If value is None the fill value will be used.

  • out_shape (tuple or list with 2 integers) – Shape of output numpy.ndarray.

  • fill (int or float, optional) – Used as fill value for all areas not covered by input geometries.

  • out (numpy.ndarray, optional) – Array in which to store results. If not provided, out_shape and dtype are required.

  • transform (Affine transformation object, optional) – Transformation from pixel coordinates of source to the coordinate system of the input shapes. See the transform property of dataset objects.

  • all_touched (boolean, optional) – If True, all pixels touched by geometries will be burned in. If false, only pixels whose center is within the polygon or that are selected by Bresenham’s line algorithm will be burned in.

  • merge_alg (MergeAlg, optional) –

    Merge algorithm to use. One of:
    MergeAlg.replace (default):

    the new value will overwrite the existing value.

    MergeAlg.add:

    the new value will be added to the existing raster.

  • default_value (int or float, optional) – Used as value for all geometries, if not provided in shapes.

  • dtype (rasterio or numpy.dtype, optional) – Used as data type for results, if out is not provided.

  • skip_invalid (bool, optional) – If True (default), invalid shapes will be skipped. If False, ValueError will be raised.

Returns:

If out was not None then out is returned, it will have been modified in-place. If out was None, this will be a new array.

Return type:

numpy.ndarray

Notes

Valid data types for fill, default_value, out, dtype and shape values are “int16”, “int32”, “uint8”, “uint16”, “uint32”, “float32”, and “float64”.

This function requires significant memory resources. The shapes iterator will be materialized to a Python list and another C copy of that list will be made. The out array will be copied and additional temporary raster memory equal to 2x the smaller of out data or GDAL’s max cache size (controlled by GDAL_CACHEMAX, default is 5% of the computer’s physical memory) is required.

If GDAL max cache size is smaller than the output data, the array of shapes will be iterated multiple times. Performance is thus a linear function of buffer size. For maximum speed, ensure that GDAL_CACHEMAX is larger than the size of out or out_shape.

rasterio.features.shapes(source, mask=None, connectivity=4, transform=(1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0))

Get shapes and values of connected regions in a dataset or array.

Parameters:
  • source (numpy.ndarray, dataset object, Band, or tuple(dataset, bidx)) – Data type must be one of rasterio.int16, rasterio.int32, rasterio.uint8, rasterio.uint16, or rasterio.float32.

  • mask (numpy.ndarray or rasterio Band object, optional) – Must evaluate to bool (rasterio.bool_ or rasterio.uint8). Values of False or 0 will be excluded from feature generation. Note well that this is the inverse sense from Numpy’s, where a mask value of True indicates invalid data in an array. If source is a numpy.ma.MaskedArray and mask is None, the source’s mask will be inverted and used in place of mask.

  • connectivity (int, optional) – Use 4 or 8 pixel connectivity for grouping pixels into features

  • transform (Affine transformation, optional) – If not provided, feature coordinates will be generated based on pixel coordinates

Yields:

polygon, value – A pair of (polygon, value) for each feature found in the image. Polygons are GeoJSON-like dicts and the values are the associated value from the image, in the data type of the image. Note: due to floating point precision issues, values returned from a floating point image may not exactly match the original values.

Notes

The amount of memory used by this algorithm is proportional to the number and complexity of polygons produced. This algorithm is most appropriate for simple thematic data. Data with high pixel-to-pixel variability, such as imagery, may produce one polygon per pixel and consume large amounts of memory.

Because the low-level implementation uses either an int32 or float32 buffer, uint32 and float64 data cannot be operated on without truncation issues.

rasterio.features.sieve(source, size, out=None, mask=None, connectivity=4)

Remove small polygon regions from a raster.

Polygons are found for each set of neighboring pixels of the same value.

Parameters:
  • source (ndarray, dataset, or Band) – The source is a 2 or 3-D ndarray, a dataset opened in “r” mode, or a single or a multiple Rasterio Band object. Must be of type rasterio.int16, rasterio.int32, rasterio.uint8, rasterio.uint16, or rasterio.float32

  • size (int) – minimum polygon size (number of pixels) to retain.

  • out (numpy ndarray, optional) – Array of same shape and data type as source in which to store results.

  • mask (numpy ndarray or rasterio Band object, optional) – Values of False or 0 will be excluded from feature generation Must evaluate to bool (rasterio.bool_ or rasterio.uint8)

  • connectivity (int, optional) – Use 4 or 8 pixel connectivity for grouping pixels into features

Returns:

out – Result

Return type:

numpy.ndarray

Notes

GDAL only supports values that can be cast to 32-bit integers for this operation.

The amount of memory used by this algorithm is proportional to the number and complexity of polygons found in the image. This algorithm is most appropriate for simple thematic data. Data with high pixel-to-pixel variability, such as imagery, may produce one polygon per pixel and consume large amounts of memory.