Trying to figure out the following situations for geographic data.  Here is the problem as you can see the point lies within chunk 64, but chunk 60 should be queried as well in this case because within the chunks are data points that contain the resource data and in this situation chunk 60 may have a border point that is a closer match than anything in chunk 64

  • Point in polygon
  • Point (outside of polygon) to edge of polygon distance

At first I tried using some basic math and the haversine formula, but that really is insufficient because it is for point to point and to do that I could find the center point of a polygon target to point and etc.

Then I realized that I am connecting to Postgis database all the time so why not push the functionality to the proven capabilities and speed of Postgis only thing is I have to create a temporary storage of the geo data and then can write some simple SQL statements.

The data is not GeoJSON so first convert the polygon points to WKT and then build a geometry and use that geometry to create a polygon.

ST_MakePolygon(ST_GeomFromText('LINESTRING(102.6757 -12, 88 -12, 88 2.88535, 102.6757 2.88535, 102.6757 -12)', 4326))

Now I have numerous polygons, but they are not in a table, I at first created a temporary table but this seems like overkill and is session bound so why not load the data into a SQL statement with the values option and express it as a WITH clause

Example only showing the first polygon, but making a geometry from WKT and then polygon and select the values into an alias of a table a(location, id)

WITH chunk_bounds AS (SELECT * FROM ( VALUES
(ST_MakePolygon(ST_GeomFromText('LINESTRING(102.6757 -12, 88 -12, 88 2.88535, 102.6757 2.88535, 102.6757 -12)', 4326)), '001') ...
) as a(location, id))

Now we can do all the things we wanted to do.  Now I want to know which chunk a point intersects with (polygons cover entire world so there will always be an intersection).  This query finds the intersecting chunk from above WITH clause for  lat/lng of (37.318161, 137.203674)

SELECT Id
FROM chunk_bounds
WHERE ST_Within(ST_PointFromText('POINT(137.203674 37.318161)', 4326)::geometry, chunk_bounds.location)

Now find all chunks that are touching (they have a distance of 0 is easy way to do this)

SELECT b.id, ST_Distance(a.location,b.location) as distance
FROM chunk_bounds a, chunk_bounds b
WHERE a.id IN (
SELECT Id
FROM chunk_bounds
WHERE ST_Within(ST_PointFromText('POINT(137.203674 37.318161)', 4326)::geometry, chunk_bounds.location)
)
and a.id != b.id
and ST_Distance(a.location, b.location) <= 0;

Ok this works, but the chunks are not of a regular size and a large chunk could possible be touching other chunks that are on the other side of the world we should limit adjacency to some reasonable distance let's say 10 KM from out target location so we don't scan chunks that will not yield anything useful

SELECT b.id, ST_Distance(a.location,b.location) as distance
FROM chunk_bounds a, chunk_bounds b
WHERE a.id IN (
SELECT Id
FROM chunk_bounds
WHERE ST_Within(ST_PointFromText('POINT(137.203674 37.318161)', 4326)::geometry, chunk_bounds.location)
)
and a.id != b.id
and st_distance_sphere(ST_ExteriorRing(b.location), ST_PointFromText('POINT(137.203674 37.318161)', 4326)::geometry) < 10000
and ST_Distance(a.location, b.location) <= 0;

Make sure to turn the polygon into an exterior ring before using the ST_Distance_Sphere function otherwise it will not be finding the distance to the edge of the polygon which is what you want.

Now run this against the Postgis database as a standard select statement and you get your results without having to work through many libraries or writing code for functions that Postgis has tested and made faster than you could ever imagine.

In summary SQL is great and makes things like this easy and possible if you know the capabilities rather than trying to do this in code or use dodgy libraries you have to trust because you're in a hurry.