GeoParquet
Purpose
Describe how GeoParquet supports scalable vector data storage, analytics, and interchange in modern geospatial data systems.
Outline
- Relationship between Parquet columnar storage and geospatial vector workloads
- Geometry encoding, coordinate reference systems, and metadata expectations
- Partitioning and indexing strategies for spatial and temporal queries
- Integration patterns with data lakes, query engines, and ML feature generation
- Tradeoffs compared with GeoJSON, Shapefiles, spatial databases, and vector tile formats
Later Examples
- Designing a partitioned GeoParquet dataset for repeated analysis
- Reading only required columns and spatial subsets
- Preparing vector features for a geospatial ML pipeline
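
As a starting point for the first example, here is a minimal sketch of one possible partition layout. It assumes Hive-style `key=value` directories (a convention many query engines recognize, not something the GeoParquet specification prescribes), WGS 84 longitude/latitude, and that each feature is assigned to a partition by a single representative point such as its centroid. The function names, the 10-degree cell size, and the `s3://example-bucket/parcels` root are all hypothetical.

```python
import math
from datetime import date


def grid_cell(lon: float, lat: float, cell_deg: float = 10.0) -> str:
    """Label the coarse grid cell containing a WGS 84 lon/lat point.

    The cell size is an illustrative assumption; tune it so each
    partition holds a reasonable number of features.
    """
    max_col = math.ceil(360.0 / cell_deg) - 1
    max_row = math.ceil(180.0 / cell_deg) - 1
    # Shift to non-negative ranges, then clamp so lon=180 or lat=90
    # fall into the last cell instead of a cell outside the grid.
    col = min(math.floor((lon + 180.0) / cell_deg), max_col)
    row = min(math.floor((lat + 90.0) / cell_deg), max_row)
    return f"c{col:03d}_r{row:03d}"


def partition_path(root: str, lon: float, lat: float, day: date,
                   cell_deg: float = 10.0) -> str:
    """Build a Hive-style directory path: time first, then space."""
    return (
        f"{root}/year={day.year}/month={day.month:02d}"
        f"/cell={grid_cell(lon, lat, cell_deg)}"
    )


# Hypothetical feature near San Francisco observed on 2024-03-15.
example = partition_path("s3://example-bucket/parcels", -122.4, 37.8,
                         date(2024, 3, 15))
print(example)
# s3://example-bucket/parcels/year=2024/month=03/cell=c005_r012
```

Putting the temporal keys before the spatial key favors queries that filter on time first. Workloads dominated by spatial filters may prefer the opposite order, and features that span several cells need a separate policy (for example, assignment by centroid, as assumed here).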