Reading Datasets

Dataset objects provide read, read-write, and write access to raster data files and are obtained by calling rasterio.open(). That function mimics Python’s built-in open() and the dataset objects it returns mimic Python file objects.

>>> import rasterio
>>> src = rasterio.open('tests/data/RGB.byte.tif')
>>> src
<open DatasetReader name='tests/data/RGB.byte.tif' mode='r'>
>>> src.name
'tests/data/RGB.byte.tif'
>>> src.mode
'r'
>>> src.closed
False

If you try to access a nonexistent path, rasterio.open() does the same thing as open(), raising an exception immediately.

>>> open('/lol/wut.tif')
Traceback (most recent call last):
 ...
FileNotFoundError: [Errno 2] No such file or directory: '/lol/wut.tif'
>>> rasterio.open('/lol/wut.tif')
Traceback (most recent call last):
 ...
rasterio.errors.RasterioIOError: No such file or directory

Datasets generally have one or more bands (or layers). Following the GDAL convention, these are indexed starting with the number 1. The first band of a file can be read like this:

>>> array = src.read(1)
>>> array.shape
(718, 791)

The returned object is a 2-dimensional Numpy ndarray. The representation of that array at the Python prompt is a summary; the GeoTIFF file that Rasterio uses for testing has 0 values in the corners, but has nonzero values elsewhere.

>>> from matplotlib import pyplot
>>> pyplot.imshow(array, cmap='pink')
<matplotlib.image.AxesImage object at 0x...>
>>> pyplot.show()  
http://farm6.staticflickr.com/5032/13938576006_b99b23271b_o_d.png

Instead of reading single bands, all bands of the input dataset can be read into a 3-dimensonal ndarray. Note that the interpretation of the 3 axes is (bands, rows, columns). See Image processing software for more details on how to convert to the ordering expected by some software.

>>> array = src.read()
>>> array.shape
(3, 718, 791)

In order to read smaller chunks of the dataset, refer to Windowed reading and writing.

The indexes, Numpy data types, and nodata values of all a dataset’s bands can be had from its indexes, dtypes, and nodatavals attributes.

>>> for i, dtype, nodataval in zip(src.indexes, src.dtypes, src.nodatavals):
...     print(i, dtype, nodataval)
...
1 uint8 0.0
2 uint8 0.0
3 uint8 0.0

To close a dataset, call its close() method.

>>> src.close()
>>> src
<closed DatasetReader name='tests/data/RGB.byte.tif' mode='r'>

After it’s closed, data can no longer be read.

>>> src.read(1)
Traceback (most recent call last):
 ...
ValueError: can't read closed raster file

This is the same behavior as Python’s file.

>>> f = open('README.rst')
>>> f.close()
>>> f.read()
Traceback (most recent call last):
 ...
ValueError: I/O operation on closed file.

As Python file objects can, Rasterio datasets can manage the entry into and exit from runtime contexts created using a with statement. This ensures that files are closed no matter what exceptions may be raised within the the block.

>>> with rasterio.open('tests/data/RGB.byte.tif', 'r') as one:
...     with rasterio.open('tests/data/RGB.byte.tif', 'r') as two:
...        print(two)
...     print(one)
<open DatasetReader name='tests/data/RGB.byte.tif' mode='r'>
<open DatasetReader name='tests/data/RGB.byte.tif' mode='r'>

>>> print(two)
<closed DatasetReader name='tests/data/RGB.byte.tif' mode='r'>
>>> print(one)
<closed DatasetReader name='tests/data/RGB.byte.tif' mode='r'>