ST_Transform in BigQuery — Project Any Coordinates to Lng/Lat and Back

Michael Entin
2 min readJul 20, 2023

--

BigQuery only supports WGS84 geodesic coordinates, but my data is in another coordinate system. How can I convert it?

We’ve seen how to project data when loading it to BigQuery, and it is still the best option. But what if the data is already in BigQuery, and we want to project it to WGS84? Thanks to proj4js — a JavaScript port of proj4 library, we can do it using BigQuery UDFs.

First, we need to download the latest release of the library from https://github.com/proj4js/proj4js/releases. The only file we need is dist/proj4.js. I like to version the scripts in UDFs, so let’s rename it to proj4_2_9_0.js , to note we are using version 2.9.0.

Now open Google Cloud Storage, create a new bucket or use an existing one, and upload this file. Record the gs://… URL of the file in GCS.

Let’s now create a UDF. It will need following input arguments:

  • Source projection
  • Destination projection
  • Coordinates in the source projection

It outputs coordinates in the destination projection, as struct with two floating point values. We just reference the JS library (replace your gs bucket URL) and call proj4 function:

CREATE OR REPLACE FUNCTION tmp.proj4(
proj_from STRING, proj_to STRING, x FLOAT64, y FLOAT64)
RETURNS STRUCT<x FLOAT64, y FLOAT64>
LANGUAGE js
OPTIONS (
library=['gs://my-bucket/proj4_2_9_0.js'])
AS r"""
let coords = proj4(proj_from, proj_to, [x,y]);
return { x: coords[0], y: coords[1] };
""";

We can then invoke it to transform coordinates between various projections. E.g. if we have UTM10 data, let’s first copy proj4js definition of the projection on epsg.io site. And we can transform it to longitude/latitude using

DECLARE utm10 STRING DEFAULT
"+proj=utm +zone=10 +datum=WGS84 +units=m +no_defs +type=crs";

WITH numbers AS
(SELECT 565942.74 AS x, 5280440.15 as y)
SELECT x, y, tmp.proj4(utm10, "EPSG:4326", x, y) AS result
FROM numbers;

If you transform a lot of data between a few projections, a bit more performant way to implement this is to define a projection-specific function, instead of generic one. This way we can construct once and reuse a specific transform, and proj4js has to parse the projection definition only once:

CREATE OR REPLACE FUNCTION tmp.fromUtm10(x FLOAT64, y FLOAT64)
RETURNS STRUCT<x FLOAT64, y FLOAT64>
LANGUAGE js
OPTIONS (
library=['gs://my-bucket/proj4_2_9_0.js'])
AS r"""
if (typeof _utm10 == "undefined") {
_utm10 = proj4("+proj=utm +zone=10 +datum=WGS84 +units=m +no_defs +type=crs",
"EPSG:4326");
}
let coords = _utm10.forward([x,y]);
return { x: coords[0], y: coords[1] };
""";

--

--

Michael Entin
Michael Entin

Written by Michael Entin

Hi, I'm TL of BigQuery Geospatial project. Posting small recipes and various notes for BQ Geospatial users.

No responses yet