Michael Entin
May 30, 2023

Clustering is mostly about filtering data. It distributes a large table (larger than roughly 0.1 to 1 GB) into shards with similar values of the clustering column. BigQuery also computes the spatial extent of each shard, so when you query such a table with a spatial filter, distant shards can be eliminated based on this metadata alone, without reading their data. This reduces query cost and improves performance. It currently does not help much with point-in-polygon queries unless the query also filters out some of the data.
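
To make the pruning idea concrete, here is a toy Python sketch. It is not how BigQuery implements clustering internally; the shard size, the sort key, and all names (`Shard`, `cluster`, `query`) are assumptions made up for illustration. It only shows why a per-shard bounding box lets a spatial filter skip distant shards without reading them.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]               # (lon, lat)
BBox = Tuple[float, float, float, float]  # (min_lon, min_lat, max_lon, max_lat)


@dataclass
class Shard:
    points: List[Point]
    bbox: BBox  # the "metadata": spatial extent of the shard


def bbox_of(points: List[Point]) -> BBox:
    lons = [p[0] for p in points]
    lats = [p[1] for p in points]
    return (min(lons), min(lats), max(lons), max(lats))


def cluster(points: List[Point], shard_size: int) -> List[Shard]:
    # Toy stand-in for clustering: sort so nearby points end up together,
    # then cut the sorted list into fixed-size shards.
    ordered = sorted(points)
    chunks = [ordered[i:i + shard_size] for i in range(0, len(ordered), shard_size)]
    return [Shard(chunk, bbox_of(chunk)) for chunk in chunks]


def overlaps(a: BBox, b: BBox) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def query(shards: List[Shard], box: BBox) -> Tuple[List[Point], int]:
    """Return matching points and the number of shards actually read."""
    hits: List[Point] = []
    scanned = 0
    for shard in shards:
        if not overlaps(shard.bbox, box):
            continue  # pruned using metadata only, data is never read
        scanned += 1
        hits.extend(
            p for p in shard.points
            if box[0] <= p[0] <= box[2] and box[1] <= p[1] <= box[3]
        )
    return hits, scanned


# Hypothetical sample data: three far-apart groups of points.
POINTS = [
    (-122.4, 37.8), (-122.3, 37.9), (-122.5, 37.7), (-122.2, 37.6),
    (2.3, 48.9), (2.4, 48.8), (2.2, 48.7), (2.5, 48.6),
    (139.7, 35.7), (139.8, 35.6), (139.6, 35.8), (139.9, 35.5),
]

shards = cluster(POINTS, shard_size=4)
hits, scanned = query(shards, (2.0, 48.0, 3.0, 49.0))
print(f"{len(hits)} matching points, {scanned} of {len(shards)} shards read")
```

A query with no selective filter (for example, a box covering the whole globe) would still read every shard in this model, which mirrors the point above that clustering helps only when the query filters out some data.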

How big is your point cloud, and how do you aggregate it?

Written by Michael Entin

Hi, I'm the TL of the BigQuery Geospatial project. I post small recipes and various notes for BQ Geospatial users.
