BigQuery GIS performance tips

1. Store Geography objects in the table

  • it avoids the cost of constructing Geography objects at runtime,
  • it allows BigQuery to use spatial index when querying this table.
SELECT * FROM T WHERE
ST_DWITHIN(geo, ST_GeogPoint(122.3, 47.6), 100)
SELECT * FROM T WHERE 
ST_DWITHIN(ST_GeogPoint(lon, lat), ST_GeogPoint(122.3, 47.6), 100)
SELECT * EXCEPT(lat, lon), ST_GeogPoint(lon, lat) AS geog FROM T
SELECT * EXCEPT(wkt) ST_GeogFromText(wkt) AS geog FROM T

2. Spatially cluster tables

3. Maybe combine spatial predicates

SELECT ... FROM T 
WHERE ST_DWITHIN(location, ST_GeogFromText('POINT(123 45)'), 100)
OR ST_DWITHIN(location, ST_GeogFromText('POINT(145 23)'), 100)
OR ...
SELECT ... FROM T 
WHERE ST_DWITHIN(location,
ST_GeogFromText('MULTIPOINT((123 45), (145 23), ...)'),
100)

4. Avoid mixing equality and spatial join conditions

SELECT ... FROM T1 a JOIN T2 b 
ON CAST(a.lat*100 AS INT64) = CAST(b.lat*100 AS INT64)
AND CAST(a.lon*100 AS INT64) = CAST(b.lon*100 AS INT64)
WHERE ST_DWITHIN(a.geo, b.geo, 10)
  • the lat/lon checks would incorrectly filter out some close neighbors,
  • the lat/lon checks prevents BigQuery from using optimized geospatial join. When both equality and geospatial conditions are present, BigQuery executes query using equality join, with post-join evaluation of spatial condition. Geospatial join is usually faster.
SELECT ... FROM T1 a JOIN T2 b
WHERE ST_DWITHIN(a.geo, b.geo, 10)

--

--

--

Hi, I'm TL of BigQuery Geospatial project. Posting small recipes and various notes for BQ Geospatial users.

Love podcasts or audiobooks? Learn on the go with our new app.

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Michael Entin

Michael Entin

Hi, I'm TL of BigQuery Geospatial project. Posting small recipes and various notes for BQ Geospatial users.

More from Medium

Load Data from CSV File to Bigquery using Python bigquery client Library.

Overcoming large-scale geospatial analysis using Google BigQuery

Extra geography simplification in BigQuery

Plotting Bar Charts in BigQuery Using a SQL User Defined Function (UDF)