Had to do a ETL task of creating a set of points in 4326 lat/lng projection to compare against the current table data in the same 4326 lat/lng projection and find all points that are unique and not within a certain distance from the current set of points.

In essence our current set of data is most likely incomplete and need to merge any missing data from a separate listed input that needs to be transformed into the same spatial geometry with a minimum distance difference to be considered unique and missing from the set of data.

What I did

  • Create a temp table
DROP TABLE if exists temp_usa_utility;
CREATE TEMP TABLE if not exists temp_usa_utility (name varchar(100) not null,geom geometry,capacity double precision not null, tags text[] );
  • Load the data into the temp table in consistent projection (sample only, much bigger in actuality)
insert into temp_usa_utility(geom, capacity, name, tags) values
 (ST_PointFromText('POINT(-98.44472200042 29.271667000132)', 4326)::geometry,40.7,'OCI Alamo Solar I','{Open Data Catalog,Batteries,efe28ece-e602-4c06-a3f1-5dd1be254d71}'),
 (ST_PointFromText('POINT(-121.228379999837 44.0680299997847)', 4326)::geometry,9.9,'NorWest Energy 2, LLC','{Open Data Catalog,71ff70c3-7191-44ec-a85b-17d5fcaec98c}'),
 (ST_PointFromText('POINT(-114.975374999656 35.7822199998532)', 4326)::geometry,93.6,'Copper Mountain Solar 4, LLC','{Open Data Catalog,09a39526-1020-4033-8680-53ea75ec5b24}');
  • Query with temp table against current database table
select t1.*
from (
       select t.name, min(st_distance(t.geom, sf.geom)) min_distance
       from temp_usa_utility t,
            target_table sf
       where sf.geom IS NOT NULL
       group by t.name
     ) a,
     temp_usa_utility t1
where a.min_distance > 0.05
and t1.name = a.name
order by capacity desc;
  • Add the missing data into the current database table
insert into target_table(name, latitude, longitude, geom)
select t1.name, ST_Y(t1.geom), ST_X(t1.geom), geom
from (
       select t.name, min(st_distance(t.geom, sf.geom)) min_distance
       from temp_usa_utility t,
            target_table sf
       where sf.geom IS NOT NULL
       group by t.name
     ) a,
     temp_usa_utility t1
where a.min_distance > 0.05
and t1.name = a.name
order by capacity desc;