Advanced Datasets
Rasterio dataset objects are modeled on Python file objects. Datasets of a few
different kinds exist, and the canonical way to obtain one is to call
rasterio.open() with a path-like object or URI-like identifier, a mode (such as
“r” or “w”), and other keyword arguments.
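The file-object analogy can be illustrated with a short round trip. This is a minimal sketch, assuming rasterio and numpy are installed; the function name, the file name example.tif, and the 3 x 4 array are made up for illustration. Writing without a CRS or transform may emit a "not georeferenced" warning, which is harmless here.

```python
import os
import tempfile


def roundtrip_mode_and_shape():
    """Write a tiny GeoTIFF, reopen it, and report what the dataset exposes."""
    # Third-party imports, assumed to be installed.
    import numpy as np
    import rasterio

    data = np.arange(12, dtype="uint8").reshape(3, 4)
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.tif")  # hypothetical file name

        # "w" mode: the driver and raster layout are passed as keyword arguments.
        with rasterio.open(
            path, "w", driver="GTiff", height=3, width=4, count=1, dtype="uint8"
        ) as dst:
            dst.write(data, 1)

        # "r" is the default mode, as with Python's built-in open().
        with rasterio.open(path) as src:
            info = (src.mode, src.count, src.shape)
            band = src.read(1)

    # Leaving the "with" block closes the dataset, like a file object.
    return info, bool((band == data).all()), src.closed
```

As with file objects, using the dataset as a context manager ensures it is closed even if an exception is raised inside the block.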
Dataset Identifiers
Datasets in a computer’s filesystem are identified by paths, “file” URLs,
or instances of pathlib.Path. The following are equivalent.
'/path/to/file.tif'
'file:///path/to/file.tif'
pathlib.Path('/path/to/file.tif')
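One way to see why these three forms are interchangeable is to reduce each to a plain path string using only the standard library. The helper below, to_plain_path, is hypothetical and is not how Rasterio parses identifiers internally; PurePosixPath is used in place of pathlib.Path so the sketch behaves the same on every platform.

```python
import os
from pathlib import PurePosixPath
from urllib.parse import unquote, urlparse


def to_plain_path(identifier):
    """Reduce a path string, "file" URL, or path object to a plain path string."""
    text = os.fspath(identifier)  # accepts str and pathlib objects
    parsed = urlparse(text)
    if parsed.scheme == "file":
        # Drop the scheme and decode any percent-escapes in the path.
        return unquote(parsed.path)
    return text


identifiers = [
    "/path/to/file.tif",
    "file:///path/to/file.tif",
    PurePosixPath("/path/to/file.tif"),
]
plain_paths = {to_plain_path(i) for i in identifiers}
```

All three identifiers collapse to the same single path in plain_paths.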
Datasets within a local zip file are identified using the “zip” scheme from Apache Commons VFS.
'zip:///path/to/file.zip!/folder/file.tif'
'zip+file:///path/to/file.zip!/folder/file.tif'
Note that ! is the separator between the path of the archive file and the
path within the archive file. Also note that this kind of identifier can’t be
expressed using pathlib.
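The role of the ! separator can be shown by splitting an identifier into its scheme, archive path, and path within the archive. This is a sketch using only the standard library; split_archive_identifier is a hypothetical helper for illustration, not part of Rasterio's API.

```python
from urllib.parse import urlparse


def split_archive_identifier(identifier):
    """Split 'scheme:///archive!/member' into (scheme, archive, member)."""
    parsed = urlparse(identifier)
    # Everything before the first "!" is the archive; the rest is the member.
    archive, sep, member = parsed.path.partition("!")
    if not sep:
        raise ValueError(f"no '!' separator in {identifier!r}")
    return parsed.scheme, archive, member


parts = split_archive_identifier("zip+file:///path/to/file.zip!/folder/file.tif")
```

Here parts holds the scheme "zip+file", the archive path "/path/to/file.zip", and the member path "/folder/file.tif".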
Similarly, variables of a netCDF dataset can be accessed using “netcdf” scheme identifiers.
'netcdf:/path/to/file.nc:variable'
Datasets on the web are identified by “http” or “https” URLs such as
'https://example.com/file.tif'
'https://landsat-pds.s3.amazonaws.com/L8/139/045/LC81390452014295LGN00/LC81390452014295LGN00_B1.TIF'
Datasets within a zip file on the web are identified using a “zip+https” scheme
and paths separated by ! as above. For example:
'zip+https://example.com/file.zip?p=x&q=y!/folder/file.tif'
Datasets on AWS S3 may be identified using “s3” scheme identifiers.
's3://landsat-pds/L8/139/045/LC81390452014295LGN00/LC81390452014295LGN00_B1.TIF'
Resources in other cloud storage systems will be similarly supported.