Tutorial
Intake-stac provides a thin interface that combines sat-stac and Intake. Its basic usage is shown below.
To begin, import intake:
In [1]: import intake
Loading a catalog
You can load data from a STAC Catalog by providing the URL to a valid STAC Catalog:
In [2]: url = 'https://raw.githubusercontent.com/cholmes/sample-stac/master/stac/catalog.json'
In [3]: catalog = intake.open_stac_catalog(url)
In [4]: list(catalog)
Out[4]: ['hurricane-harvey']
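A STAC Catalog is a JSON document whose links point to child catalogs and items, and relative links are resolved against the catalog's own URL. The sketch below illustrates that idea with the standard library only. The catalog document, the example.com URLs, and the child_urls helper are hypothetical, and this is not intake-stac's implementation:

```python
from urllib.parse import urljoin

# Hypothetical catalog document, trimmed to the fields used here.
catalog_url = "https://example.com/stac/catalog.json"
catalog_doc = {
    "stac_version": "0.9.0",
    "id": "sample",
    "description": "A hypothetical root catalog",
    "links": [
        {"rel": "self", "href": "https://example.com/stac/catalog.json"},
        {"rel": "child", "href": "hurricane-harvey/catalog.json"},
    ],
}


def child_urls(doc, base_url):
    """Return absolute URLs for every 'child' link in a catalog document."""
    return [
        urljoin(base_url, link["href"])
        for link in doc["links"]
        if link["rel"] == "child"
    ]


print(child_urls(catalog_doc, catalog_url))
```

Each child URL points to another catalog document, which is why a catalog can be explored level by level.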
You can also point to STAC Collections or Items. Each constructor returns an Intake Catalog whose top level corresponds to the STAC object used for initialization:
In [5]: root_url = 'https://raw.githubusercontent.com/sat-utils/sat-stac/master/test/catalog'
In [6]: stac_cat = intake.open_stac_catalog(
...: f'{root_url}/catalog.json',
...: )
...:
In [7]: collection_cat = intake.open_stac_collection(
...: f'{root_url}/eo/landsat-8-l1/catalog.json',
...: )
...:
In [8]: items_cat = intake.open_stac_item(
...: f'{root_url}/eo/landsat-8-l1/item.json'
...: )
...:
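Which constructor applies depends on the kind of STAC document a URL points to. As a rough sketch, the kind can usually be guessed from the JSON itself: Items are GeoJSON Features, ItemCollections are FeatureCollections, and Collections carry an extent field. The constructor_for helper below is hypothetical (it is not part of Intake-stac) and only returns the name of the matching constructor:

```python
def constructor_for(doc):
    """Guess which Intake-stac constructor matches a parsed STAC document."""
    kind = doc.get("type")
    if kind == "Feature":
        return "open_stac_item"
    if kind == "FeatureCollection":
        return "open_stac_item_collection"
    # Newer STAC versions set "type": "Collection"; older ones can be
    # recognized by the required "extent" field.
    if kind == "Collection" or "extent" in doc:
        return "open_stac_collection"
    return "open_stac_catalog"


# Hypothetical, heavily trimmed documents.
print(constructor_for({"type": "Feature", "id": "scene", "assets": {}}))
print(constructor_for({"id": "landsat-8-l1", "extent": {}, "license": "PDDL-1.0"}))
print(constructor_for({"id": "root", "links": []}))
```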
Intake-stac uses sat-stac to parse STAC objects. You can also pass satstac objects (e.g. satstac.Collection) directly to the Intake-stac constructors:
In [9]: import satstac
In [10]: col = satstac.Collection.open(f'{root_url}/catalog.json')
In [11]: collection_cat = intake.open_stac_collection(col)
Using the catalog
Once you have a catalog, you can display its entries by iterating through its contents:
In [12]: print(list(catalog))
['hurricane-harvey']
In [13]: cat = catalog['hurricane-harvey']
In [14]: print(list(cat))
['hurricane-harvey-0831']
In [15]: subcat = cat['hurricane-harvey-0831']
In [16]: items = list(subcat)
In [17]: print(items)
['20170831_172754_101c', '2017831_195552_SS02', '20170831_195425_SS02', '20170831_162740_ssc1d1', 'Houston-East-20170831-103f-100d-0f4f-RGB']
When you locate an item of interest, you have access to its metadata and to methods that load its assets into Python objects:
In [18]: item = subcat['Houston-East-20170831-103f-100d-0f4f-RGB']
In [19]: print(type(item))
<class 'intake_stac.catalog.StacItem'>
In [20]: print(item.metadata)
{'datetime': datetime.datetime(2017, 8, 31, 17, 24, 57, 555491, tzinfo=tzlocal()), 'constellation': 'planetscope', 'instruments': ['PS2'], 'gsd': 3.7, 'eo:cloud_cover': 2, 'view:sun_azimuth': 145.5, 'view:sun_elevation': 64.9, 'view:off_nadir': 0.2, 'proj:epsg_code': 32615, 'pl:ground_control': True, 'bbox': [-95.73737276800716, 29.561332400220497, -95.05332428370095, 30.157560439570304], 'geometry': {'type': 'Polygon', 'coordinates': [[[-95.73737276800716, 30.14525788823348], [-95.06532619920118, 30.157560439570304], [-95.05332428370095, 29.57334931237589], [-95.7214758280382, 29.561332400220497], [-95.73737276800716, 30.14525788823348]]]}, 'date': datetime.date(2017, 8, 31), 'catalog_dir': ''}
In [21]: assets = list(item)
In [22]: print(assets)
['thumbnail', 'mosaic']
In [23]: asset = item['thumbnail']
In [24]: print(type(asset))
<class 'intake_xarray.image.ImageSource'>
In [25]: print(asset.urlpath)
https://storage.googleapis.com/pdd-stac/disasters/hurricane-harvey/0831/Houston-East-20170831-103f-100d-0f4f-3-band.png
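The step-by-step drill-down above (catalog, sub-catalog, item, asset) follows one pattern: list the names at a level, then index by name. A minimal sketch of a recursive walk over that pattern is shown below. It uses nested dicts as a stand-in for the catalog hierarchy, and the walk helper and placeholder asset values are hypothetical. A real Intake catalog may not be a collections.abc.Mapping, so the container test would need adjusting:

```python
from collections.abc import Mapping


def walk(node, prefix=()):
    """Yield the path (tuple of names) to every leaf in a nested mapping."""
    for name in node:
        child = node[name]
        path = prefix + (name,)
        if isinstance(child, Mapping):
            yield from walk(child, path)  # descend into containers
        else:
            yield path                    # a leaf, e.g. an asset


# Stand-in hierarchy mirroring the names shown above; values are placeholders.
tree = {
    "hurricane-harvey": {
        "hurricane-harvey-0831": {
            "Houston-East-20170831-103f-100d-0f4f-RGB": {
                "thumbnail": "<asset>",
                "mosaic": "<asset>",
            },
        },
    },
}

for path in walk(tree):
    print(" / ".join(path))
```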
If the catalog has too many entries to comfortably print all at once, you can narrow it by searching for a term (e.g. ‘thumbnail’):
In [26]: for id, entry in subcat.search('thumbnail').items():
....: print(id)
....:
20170831_172754_101c.thumbnail
2017831_195552_SS02.thumbnail
2017831_195552_SS02.full-jpg
20170831_195425_SS02.thumbnail
20170831_162740_ssc1d1.thumbnail
Houston-East-20170831-103f-100d-0f4f-RGB.thumbnail
In [27]: asset = subcat['Houston-East-20170831-103f-100d-0f4f-RGB.thumbnail']
In [28]: print(asset.urlpath)
https://storage.googleapis.com/pdd-stac/disasters/hurricane-harvey/0831/Houston-East-20170831-103f-100d-0f4f-3-band.png
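The search results above use dotted item.asset keys, and the 2017831_195552_SS02.full-jpg hit suggests that matching is not limited to entry names. The sketch below shows one way such a search could work: flatten the hierarchy into dotted keys, then keep entries whose key or descriptive text contains the term. The flatten and search helpers and the sample data are hypothetical, and this is not necessarily how Intake implements search:

```python
def flatten(items):
    """Map 'item.asset' keys to each asset's metadata."""
    return {
        f"{item_id}.{asset_name}": meta
        for item_id, assets in items.items()
        for asset_name, meta in assets.items()
    }


def search(entries, text):
    """Keep entries whose key or metadata text contains any search word."""
    words = text.lower().split()
    return {
        key: meta
        for key, meta in entries.items()
        if any(w in key.lower() or w in str(meta).lower() for w in words)
    }


# Hypothetical item with three assets.
items = {
    "scene-a": {
        "thumbnail": {"title": "Preview"},
        "full-jpg": {"title": "Full-size thumbnail image"},
        "mosaic": {"title": "Cloud-optimized GeoTIFF"},
    },
}

for key in search(flatten(items), "thumbnail"):
    print(key)
```

Here scene-a.full-jpg matches because of its title, not its name, while scene-a.mosaic is filtered out.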
Loading a dataset
Once you have identified a dataset, you can load it into an xarray.DataArray using Intake’s to_dask() method. This reads only the metadata up front; pixel values are streamed over the network when a computation or visualization needs them:
In [29]: da = asset.to_dask()
In [30]: display(da)
<xarray.DataArray (y: 552, x: 549, channel: 3)>
dask.array<xarray-<this-array>, shape=(552, 549, 3), dtype=uint8, chunksize=(552, 549, 3), chunktype=numpy.ndarray>
Coordinates:
* y (y) int64 0 1 2 3 4 5 6 7 8 ... 543 544 545 546 547 548 549 550 551
* x (x) int64 0 1 2 3 4 5 6 7 8 ... 540 541 542 543 544 545 546 547 548
* channel (channel) int64 0 1 2
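The idea behind this deferred loading can be illustrated without any network access. The sketch below is a toy stand-in, not how dask or xarray are implemented: the LazyImage class and fake_read function are hypothetical, and the "read" is simulated with a list:

```python
class LazyImage:
    """Knows its shape immediately; reads pixel values only on request."""

    def __init__(self, shape, reader):
        self.shape = shape      # metadata, available without reading data
        self._reader = reader   # callable that performs the actual read

    def compute(self):
        # No caching in this sketch: each call reads again.
        return self._reader()


reads = []  # records every simulated read


def fake_read():
    reads.append("read")
    return [[0, 0, 0], [0, 0, 0]]


image = LazyImage(shape=(2, 3), reader=fake_read)
print(image.shape, len(reads))   # shape is known before any read happens
pixels = image.compute()
print(len(pixels), len(reads))   # values arrive only after compute()
```

With the real DataArray, the analogous step is calling its compute() method or accessing its values attribute, which is when the image bytes are actually fetched.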
Working with sat-search
Intake-stac integrates with sat-search to facilitate dynamic search and discovery of assets through a STAC API. To begin, construct a search query using sat-search:
In [31]: import satsearch
In [32]: print(satsearch.__version__)
0.3.0
In [33]: URL='https://earth-search.aws.element84.com/v0'
In [34]: results = satsearch.Search.search(
....: url=URL,
....: collections=['landsat-8-l1-c1'],
....: bbox=[43.16, -11.96, 43.54, -11.32]  # [min lon, min lat, max lon, max lat]
....: )
....:
In [35]: items = results.items()
In [36]: items
Out[36]: <satstac.itemcollection.ItemCollection at 0x7effe58b8090>
In the code section above, items is a satstac.ItemCollection object. Intake-stac can turn this object into an Intake catalog:
In [37]: catalog = intake.open_stac_item_collection(items)
In [38]: list(catalog)
Out[38]:
['LC08_L1TP_162068_20201210_20201218_01_T1',
'LC08_L1TP_163068_20201201_20201217_01_T1',
'LC08_L1TP_162068_20201124_20201210_01_T1',
'LC08_L1TP_163068_20201115_20201209_01_T1',
'LO08_L1TP_162068_20201108_20201120_01_T1',
'LC08_L1TP_162068_20201023_20201105_01_T1',
'LC08_L1TP_162068_20201007_20201016_01_T1',
'LC08_L1TP_163068_20200928_20201007_01_T1',
'LC08_L1TP_162068_20200921_20201006_01_T1',
'LC08_L1TP_162068_20200905_20200917_01_T1']
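The entry names in this listing are Landsat Collection 1 product identifiers, which encode the sensor, processing level, WRS path and row, acquisition date, processing date, collection number, and tier. A minimal sketch for unpacking them, for example to sort or filter catalog entries by acquisition date, is shown below. The LandsatScene type and parse_product_id helper are hypothetical and not part of Intake-stac or sat-search:

```python
from datetime import date, datetime
from typing import NamedTuple


class LandsatScene(NamedTuple):
    sensor: str        # "C" = OLI/TIRS combined, "O" = OLI only, "T" = TIRS only
    satellite: int
    level: str
    path: int
    row: int
    acquired: date
    processed: date
    collection: str
    tier: str


def parse_product_id(product_id):
    """Split an identifier such as LC08_L1TP_162068_20201210_20201218_01_T1."""
    platform, level, path_row, acquired, processed, collection, tier = (
        product_id.split("_")
    )
    return LandsatScene(
        sensor=platform[1],
        satellite=int(platform[2:]),
        level=level,
        path=int(path_row[:3]),
        row=int(path_row[3:]),
        acquired=datetime.strptime(acquired, "%Y%m%d").date(),
        processed=datetime.strptime(processed, "%Y%m%d").date(),
        collection=collection,
        tier=tier,
    )


scene = parse_product_id("LC08_L1TP_162068_20201210_20201218_01_T1")
print(scene.path, scene.row, scene.acquired)
```

Applied to list(catalog), this would let you pick, say, only the scenes on path 162 or the most recent acquisition.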