In-Memory Files
Other sections of this documentation have explained how Rasterio can access data stored in existing files on disk written by other programs or write files to be used by other GIS programs. Filenames have been the typical inputs and files on disk have been the typical outputs.
import rasterio

with rasterio.open('example.tif') as dataset:
    data_array = dataset.read()
There are different options for Python programs that have streams of bytes, e.g., from a network socket, as their input or output instead of filenames. One is the use of a temporary file on disk.
import tempfile

with tempfile.NamedTemporaryFile() as tmpfile:
    tmpfile.write(data)
    tmpfile.flush()  # make sure the bytes reach the file before it is reopened by name
    with rasterio.open(tmpfile.name) as dataset:
        data_array = dataset.read()
Another is Rasterio’s MemoryFile, an abstraction for objects in GDAL’s in-memory filesystem.
MemoryFile: BytesIO meets NamedTemporaryFile
The MemoryFile class behaves a bit like BytesIO and NamedTemporaryFile(). A GeoTIFF file held in a sequence of bytes, data, can be opened in memory as shown below.
from rasterio.io import MemoryFile

with MemoryFile(data) as memfile:
    with memfile.open() as dataset:
        data_array = dataset.read()
This code can be several times faster than the code using NamedTemporaryFile(), at roughly double the price in memory.
Writing MemoryFiles
Incremental writes to an empty MemoryFile are also possible.
with MemoryFile() as memfile:
    while True:
        data = f.read(8192)  # ``f`` is an input stream.
        if not data:
            break
        memfile.write(data)
    with memfile.open() as dataset:
        data_array = dataset.read()
These two modes are incompatible: a MemoryFile initialized with a sequence of bytes cannot be extended.
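The chunked loop used for incremental writes can be factored into a small helper. The following is a minimal sketch using only the standard library; the name copy_stream is hypothetical, and the BytesIO objects are stand-ins for a real input stream and for the destination. Any object with a write() method, such as an empty MemoryFile, could presumably be passed as the destination instead.

```python
import io


def copy_stream(src, dst, chunk_size=8192):
    """Copy ``src`` to ``dst`` in chunks; return the number of bytes copied.

    ``copy_stream`` is an illustrative helper, not part of Rasterio.
    """
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # an empty read signals the end of the stream
            break
        dst.write(chunk)
        total += len(chunk)
    return total


# Stand-ins for an input stream (e.g. a socket file) and a destination.
source = io.BytesIO(b'x' * 20000)
sink = io.BytesIO()
copied = copy_stream(source, sink)
```

Reading in fixed-size chunks keeps only one chunk of the input in Python memory at a time, in addition to whatever the destination itself holds.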
An empty MemoryFile can also be written to using dataset API methods.
with MemoryFile() as memfile:
    with memfile.open(driver='GTiff', count=3, ...) as dataset:
        dataset.write(data_array)
Reading MemoryFiles
Like BytesIO, MemoryFile implements the Python file protocol and provides read(), seek(), and tell() methods. Instances are thus suitable as arguments for methods like requests.post().
import requests

with MemoryFile() as memfile:
    with memfile.open(driver='GTiff', count=3, ...) as dataset:
        dataset.write(data_array)
    requests.post('https://example.com/upload', data=memfile)