// the find
opendatacube/datacube-core
Open Data Cube analyses continental scale Earth Observation data through time
Open Data Cube is a Python framework for indexing, cataloging, and loading continental-scale Earth observation raster data (Landsat, Sentinel, MODIS, etc.) against a PostGIS/PostgreSQL backend. It abstracts the metadata catalog, spatial indexing, and gridded loading into a single API, returning xarray Datasets aligned to a common grid. Aimed at national agencies, research institutions, and anyone doing time-series satellite analysis at scale.
The pluggable index backend design (postgres, postgis, memory, null) is genuinely well thought out: abstract base classes in `datacube/index/abstract/` let you swap backends without touching analysis code. Native Dask integration in the loading layer gives you lazy, chunked loading of TB-scale datasets by adding a `dask_chunks` argument to the same load call. The virtual product system (`datacube/virtual/`) lets you compose derived products (band math, masking, resampling) declaratively in YAML, which is useful when you have dozens of downstream pipelines all doing the same preprocessing. Active deployments at Digital Earth Australia and Digital Earth Africa mean the v1.9 migration path is battle-tested, not theoretical.
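To make the Dask point concrete, here is a minimal sketch of how the same load call goes from eager to lazy. `load` and its `dask_chunks` argument are part of the `Datacube` API; the product name `ls8_example`, the band names, the bounding box, and the chunk sizes are all made up for illustration, and the helper `load_red_nir` is hypothetical.

```python
def load_red_nir(dc, dask_chunks=None):
    """Load two bands for a small area and time range.

    `dc` is expected to be a `datacube.Datacube` instance. The product and
    measurement names below are hypothetical placeholders, not real products.
    """
    return dc.load(
        product="ls8_example",        # hypothetical product name
        measurements=["red", "nir"],  # hypothetical band names
        x=(148.0, 149.0),             # longitude range (made up)
        y=(-36.0, -35.0),             # latitude range (made up)
        time=("2020-01", "2020-03"),
        output_crs="EPSG:3577",
        resolution=(-30, 30),
        # None loads pixels eagerly; a dict of chunk sizes returns
        # Dask-backed arrays and defers the actual reads.
        dask_chunks=dask_chunks,
    )


# Usage against a configured, indexed datacube (not run here):
#   import datacube
#   dc = datacube.Datacube(app="lazy-load-sketch")
#   eager = load_red_nir(dc)
#   lazy = load_red_nir(dc, dask_chunks={"time": 1, "x": 2048, "y": 2048})
#   lazy.red.mean(dim="time").compute()  # pixels are read at this point
```

The only difference between the eager and lazy paths is the one keyword argument, which is why downstream analysis code usually does not need to change.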
The two-database-driver situation (postgres vs. postgis) is a real headache: the legacy `postgres` driver is still present and partially supported, so docs and examples sometimes point at the wrong one. Setup friction is high: you need Conda/Mamba, PostGIS, the GDAL stack, and indexed product definitions before you can load a single pixel, and there's no quick-start that actually works end-to-end. The metadata type / product / dataset distinction trips up every new user, and the YAML schema is underdocumented relative to how load-bearing it is. STAC support exists (`_stacconverter.py`) but it's clearly bolted on rather than first-class, so if your data pipeline starts from a STAC catalog you'll be fighting the impedance mismatch.
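For the three-layer confusion, a rough sketch of how the documents point at each other may help. These are heavily trimmed, hypothetical documents written as Python dicts (in practice they are YAML files), with field names following my understanding of the eo3 layout; they are not complete enough to index, and every value is invented. The helper `links_are_consistent` is also hypothetical and is not part of datacube.

```python
# Layer 1: the metadata type says how dataset documents are structured.
metadata_type = {
    "name": "eo3",
    "description": "Trimmed placeholder, not a full metadata type document",
}

# Layer 2: the product names a metadata type and declares its bands.
product = {
    "name": "ls8_example",  # hypothetical product name
    "description": "Hypothetical example product",
    "metadata_type": "eo3",
    "metadata": {"product": {"name": "ls8_example"}},
    "measurements": [
        {"name": "red", "dtype": "int16", "nodata": -999, "units": "1"},
        {"name": "nir", "dtype": "int16", "nodata": -999, "units": "1"},
    ],
}

# Layer 3: each dataset names its product and points at the actual files.
dataset = {
    "id": "00000000-0000-0000-0000-000000000000",  # placeholder UUID
    "product": {"name": "ls8_example"},
    "measurements": {
        "red": {"path": "red.tif"},
        "nir": {"path": "nir.tif"},
    },
    "properties": {"datetime": "2020-01-01T00:00:00Z"},
}


def links_are_consistent(metadata_type, product, dataset):
    """Check that each layer references the layer above it by name."""
    declared_bands = {m["name"] for m in product["measurements"]}
    return (
        product["metadata_type"] == metadata_type["name"]
        and dataset["product"]["name"] == product["name"]
        and set(dataset["measurements"]) <= declared_bands
    )
```

The chain is dataset → product → metadata type, all linked by name, which is why a typo in any one YAML file tends to surface as a confusing error at index time rather than at load time.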