The ecosystem of geospatial machine learning tools in the Pangeo world

FOSS4G SOTM Oceania 2023

Several open source tools are enabling the shift to cloud-native geospatial machine learning workflows. Stream data from STAC APIs, generate machine-learning-ready chips on the fly, and train models for different downstream tasks! Find out about advances in the Pangeo ML community towards scalable GPU-native workflows.

This talk presents an overview of open source Python packages in the Pangeo (big data geoscience) machine learning community. For reading and writing data, [kvikIO](https://github.com/rapidsai/kvikio) allows low-latency data transfers from Zarr archives via NVIDIA GPUDirect Storage. With tensors loaded in xarray data structures, [xbatcher](https://github.com/xarray-contrib/xbatcher) enables efficient, iterative slicing of arrays. To connect the pieces, [zen3geo](https://github.com/weiji14/zen3geo) acts as the glue between geospatial libraries, from reading [STAC](https://stacspec.org) items and rasterizing vector geometries to stacking multi-resolution datasets for custom data pipelines. Learn more as the Pangeo community develops tutorials at [Project Pythia](https://cookbooks.projectpythia.org), and join in to hear about the challenges and ideas on scaling machine learning in the geosciences with the [Pangeo ML Working Group](https://pangeo.io/meeting-notes.html#working-group-meetings).
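
To make the idea of slicing an array into chips iteratively more concrete, here is a minimal sketch in plain NumPy. It is not xbatcher's API or implementation. The helper name `iter_chips`, the `(bands, height, width)` layout, and the synthetic scene are all assumptions made for illustration, and a real pipeline would more likely work on labelled xarray objects streamed from a STAC API.

```python
import numpy as np


def iter_chips(image, chip_size, stride=None):
    """Yield (row, col, chip) tuples cut from a (bands, height, width) array.

    Hypothetical helper for illustration only. Partial chips at the
    right and bottom edges are skipped rather than padded.
    """
    stride = chip_size if stride is None else stride
    _, height, width = image.shape
    for row in range(0, height - chip_size + 1, stride):
        for col in range(0, width - chip_size + 1, stride):
            # Basic slicing returns a view, so no pixel data is copied here.
            yield row, col, image[:, row:row + chip_size, col:col + chip_size]


# Synthetic 4-band, 256 x 256 pixel "scene" standing in for real imagery.
scene = np.arange(4 * 256 * 256, dtype=np.float32).reshape(4, 256, 256)

# Non-overlapping 64 x 64 chips, stacked into one training batch
# with layout (n_chips, bands, chip_height, chip_width).
chips = [chip for _, _, chip in iter_chips(scene, chip_size=64)]
batch = np.stack(chips)
```

Passing a `stride` smaller than `chip_size` produces overlapping chips, which is one common way to avoid losing context at chip boundaries.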

Speakers: Wei Ji Leong