Nearest neighbor in BigQuery GIS

SELECT 
a.id,
ARRAY_AGG(b.id ORDER BY ST_Distance(a.geog, b.geog) LIMIT 1)
[ORDINAL(1)] as neighbor_id

FROM people_table a JOIN restaurant_table b
GROUP BY a.id
SELECT 
a.id,
ARRAY_AGG(b.id ORDER BY ST_Distance(a.geog, b.geog) LIMIT 1)
[ORDINAL(1)] as neighbor_id
FROM people_table a JOIN restaurant_table b
ON ST_DWithin(a.geog, b.geom, 123.45) -- 123.45 is search radius
GROUP BY a.id
  • If you need more fields from table b, replace expression b.id inside ARRAY_AGG with STRUCT(b.id, b.extra_field, b.one_more_field)
  • If you need more fields from table a, especially in case of GEOGRAPHY that you cannot use as aggregation key, use ANY_VALUE aggregation function:
    SELECT ... , ANY_VALUE(a.geog) ... GROUP BY a.id
WITH missed_people_table AS (
SELECT a.* FROM people_table a
WHERE NOT EXISTS(
SELECT 1
FROM people_with_neighbors n -- result of first SELECT
WHERE n.id = a.id)
SELECT
... -- the same query as above only
-- 1) using missed_people_table
-- 2) larger search radius

--

--

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Michael Entin

Michael Entin

151 Followers

Hi, I'm TL of BigQuery Geospatial project. Posting small recipes and various notes for BQ Geospatial users.