Tutorial

Intake-stac provides a thin interface that combines sat-stac and Intake. Its basic usage is shown below.

To begin, import intake:

In [1]: import intake

Loading a catalog

You can load data from a STAC Catalog by providing the URL to a valid STAC Catalog:

In [2]: url = 'https://raw.githubusercontent.com/cholmes/sample-stac/master/stac/catalog.json'

In [3]: catalog = intake.open_stac_catalog(url)

In [4]: list(catalog)
Out[4]: ['hurricane-harvey']

You can also point to STAC Collections or Items. Each constructor returns an Intake Catalog whose top level corresponds to the STAC object used for initialization:

In [5]: root_url = 'https://raw.githubusercontent.com/sat-utils/sat-stac/master/test/catalog'

In [6]: stac_cat = intake.open_stac_catalog(
   ...:     f'{root_url}/catalog.json',
   ...: )
   ...: 

In [7]: collection_cat = intake.open_stac_collection(
   ...:     f'{root_url}/eo/landsat-8-l1/catalog.json',
   ...: )
   ...: 

In [8]: items_cat = intake.open_stac_item(
   ...:     f'{root_url}/eo/landsat-8-l1/item.json'
   ...: )
   ...: 

Intake-stac uses sat-stac to parse STAC objects. You can also pass sat-stac objects (e.g. satstac.Collection) directly to the Intake-stac constructors:

In [9]: import satstac

In [10]: col = satstac.Collection.open(f'{root_url}/catalog.json')

In [11]: collection_cat = intake.open_stac_collection(col)
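
If you do not know in advance whether a URL points to a Catalog, a Collection, or an Item, you can inspect the parsed JSON before choosing a constructor. The sketch below is one possible way to do that, not part of Intake-stac: guess_stac_kind and OPENERS are hypothetical names, and the fallback for documents without a top-level "type" field (which STAC 1.0 documents carry) is a heuristic assumption.

```python
# Sketch only: `guess_stac_kind` is a hypothetical helper, not part of intake-stac.
def guess_stac_kind(stac_dict):
    """Return 'item', 'collection', or 'catalog' for a parsed STAC JSON dict."""
    declared = stac_dict.get("type")
    # STAC 1.0 documents declare their kind in a top-level "type" field.
    if declared == "Feature":
        return "item"
    if declared == "Collection":
        return "collection"
    if declared == "Catalog":
        return "catalog"
    # Assumption: older documents lack "type", so fall back to telltale fields.
    if "geometry" in stac_dict and "assets" in stac_dict:
        return "item"
    if "extent" in stac_dict:
        return "collection"
    return "catalog"


# Names of the intake constructors used in this tutorial, keyed by kind.
OPENERS = {
    "catalog": "open_stac_catalog",
    "collection": "open_stac_collection",
    "item": "open_stac_item",
}

# A minimal, made-up Collection-like document for demonstration.
kind = guess_stac_kind({"id": "landsat-8-l1", "extent": {}, "links": []})
print(kind, "->", OPENERS[kind])
```

With the name in hand, something like getattr(intake, OPENERS[kind])(url) would call the matching constructor.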

Using the catalog

Once you have a catalog, you can display its entries by iterating through its contents:

In [12]: print(list(catalog))
['hurricane-harvey']

In [13]: cat = catalog['hurricane-harvey']

In [14]: print(list(cat))
['hurricane-harvey-0831']

In [15]: subcat = cat['hurricane-harvey-0831']

In [16]: items = list(subcat)

In [17]: print(items)
['20170831_172754_101c', '2017831_195552_SS02', '20170831_195425_SS02', '20170831_162740_ssc1d1', 'Houston-East-20170831-103f-100d-0f4f-RGB']
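
The manual drill-down above can be generalized into a small recursive walk. The following is a minimal sketch that assumes only what the session shows: a catalog can be iterated to get entry names and indexed by name to get the child. The walk function is a hypothetical helper, and a nested dict stands in for the real catalog so the example runs without network access.

```python
def walk(catalog, max_depth, prefix=()):
    """Yield the key path of every entry found down to max_depth levels."""
    if max_depth == 0:
        return
    for name in catalog:
        path = prefix + (name,)
        yield path
        # Indexing by name returns the child, as with catalog['hurricane-harvey'].
        yield from walk(catalog[name], max_depth - 1, path)


# Stand-in for the real catalog: nested dicts with a subset of the names above.
fake_catalog = {
    "hurricane-harvey": {
        "hurricane-harvey-0831": {
            "20170831_172754_101c": {},
            "Houston-East-20170831-103f-100d-0f4f-RGB": {},
        }
    }
}

for path in walk(fake_catalog, max_depth=3):
    print("/".join(path))
```

The max_depth argument matters with a real catalog, since each level may trigger a network request and items themselves contain a further level of assets.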

When you locate an item of interest, you have access to its metadata and to methods that load its assets into Python objects:

In [18]: item = subcat['Houston-East-20170831-103f-100d-0f4f-RGB']

In [19]: print(type(item))
<class 'intake_stac.catalog.StacItem'>

In [20]: print(item.metadata)
{'datetime': datetime.datetime(2017, 8, 31, 17, 24, 57, 555491, tzinfo=tzlocal()), 'constellation': 'planetscope', 'instruments': ['PS2'], 'gsd': 3.7, 'eo:cloud_cover': 2, 'view:sun_azimuth': 145.5, 'view:sun_elevation': 64.9, 'view:off_nadir': 0.2, 'proj:epsg_code': 32615, 'pl:ground_control': True, 'bbox': [-95.73737276800716, 29.561332400220497, -95.05332428370095, 30.157560439570304], 'geometry': {'type': 'Polygon', 'coordinates': [[[-95.73737276800716, 30.14525788823348], [-95.06532619920118, 30.157560439570304], [-95.05332428370095, 29.57334931237589], [-95.7214758280382, 29.561332400220497], [-95.73737276800716, 30.14525788823348]]]}, 'date': datetime.date(2017, 8, 31), 'catalog_dir': ''}

In [21]: assets = list(item)

In [22]: print(assets)
['thumbnail', 'mosaic']

In [23]: asset = item['thumbnail']

In [24]: print(type(asset))
<class 'intake_xarray.image.ImageSource'>

In [25]: print(asset.urlpath)
https://storage.googleapis.com/pdd-stac/disasters/hurricane-harvey/0831/Houston-East-20170831-103f-100d-0f4f-3-band.png
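
Item metadata is a plain dictionary, so you can use it to narrow down items before loading anything. The sketch below filters on the 'eo:cloud_cover' field seen in the printout above. The select_by_cloud_cover helper is hypothetical, and two of the three entries are invented purely for illustration.

```python
def select_by_cloud_cover(metadata_by_id, max_cover):
    """Return ids whose 'eo:cloud_cover' is present and at most max_cover."""
    selected = []
    for item_id, metadata in metadata_by_id.items():
        cover = metadata.get("eo:cloud_cover")
        # Items without the field are skipped rather than assumed clear.
        if cover is not None and cover <= max_cover:
            selected.append(item_id)
    return selected


metadata_by_id = {
    # Values copied from the item.metadata printout above.
    "Houston-East-20170831-103f-100d-0f4f-RGB": {"eo:cloud_cover": 2, "gsd": 3.7},
    # Hypothetical entries, invented for illustration only.
    "example-cloudy-item": {"eo:cloud_cover": 80, "gsd": 3.7},
    "example-no-cloud-field": {"gsd": 3.7},
}

print(select_by_cloud_cover(metadata_by_id, max_cover=10))
```

With a real catalog, a mapping such as {name: subcat[name].metadata for name in subcat} could supply the input, assuming every entry exposes metadata as the item above does.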

If the catalog has too many entries to comfortably print all at once, you can narrow it by searching for a term (e.g. ‘thumbnail’):

In [26]: for id, entry in subcat.search('thumbnail').items():
   ....:     print(id)
   ....: 
20170831_172754_101c.thumbnail
2017831_195552_SS02.thumbnail
2017831_195552_SS02.full-jpg
20170831_195425_SS02.thumbnail
20170831_162740_ssc1d1.thumbnail
Houston-East-20170831-103f-100d-0f4f-RGB.thumbnail

In [27]: asset = subcat['Houston-East-20170831-103f-100d-0f4f-RGB.thumbnail']

In [28]: print(asset.urlpath)
https://storage.googleapis.com/pdd-stac/disasters/hurricane-harvey/0831/Houston-East-20170831-103f-100d-0f4f-3-band.png
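
Note that the search results above include a full-jpg entry, so a search hit is not necessarily an asset named 'thumbnail'. If you want only those, you can filter on the part of the entry name after the last dot. This is a sketch with a hypothetical split_entry_name helper, and it assumes asset keys themselves contain no dots.

```python
def split_entry_name(name):
    """Split 'item_id.asset_key' at the last dot into (item_id, asset_key)."""
    item_id, _, asset_key = name.rpartition(".")
    return item_id, asset_key


# Entry names copied from the search output above.
search_hits = [
    "20170831_172754_101c.thumbnail",
    "2017831_195552_SS02.thumbnail",
    "2017831_195552_SS02.full-jpg",
    "20170831_195425_SS02.thumbnail",
    "20170831_162740_ssc1d1.thumbnail",
    "Houston-East-20170831-103f-100d-0f4f-RGB.thumbnail",
]

# Keep only the hits whose asset key is exactly 'thumbnail'.
thumbnails = [n for n in search_hits if split_entry_name(n)[1] == "thumbnail"]
for name in thumbnails:
    print(name)
```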

Loading a dataset

Once you have identified a dataset, you can load it into an xarray.DataArray using Intake’s to_dask() method. This call reads only metadata; the values are streamed over the network when a computation or visualization requires them:

In [29]: da = asset.to_dask()

In [30]: display(da)
<xarray.DataArray (y: 552, x: 549, channel: 3)>
dask.array<xarray-<this-array>, shape=(552, 549, 3), dtype=uint8, chunksize=(552, 549, 3), chunktype=numpy.ndarray>
Coordinates:
  * y        (y) int64 0 1 2 3 4 5 6 7 8 ... 543 544 545 546 547 548 549 550 551
  * x        (x) int64 0 1 2 3 4 5 6 7 8 ... 540 541 542 543 544 545 546 547 548
  * channel  (channel) int64 0 1 2