Tutorial

Getting Started with Spatial SQL using PostGIS

This tutorial provides a brief overview of SQL and spatial SQL. It was designed for those who have little or no experience with SQL so it starts with basic queries and progresses to more complex queries that include spatial functions from PostGIS. The tutorial at the end presents a list of questions that you will use to write your own SQL queries.

All the queries in this lesson can be run in QGIS Browser panel Execute SQL tool. The results from the spatial and non-spatial queries can be loaded into QGIS as “query layers”.

Run the following queries to familiarize yourself with SQL. It should be noted that you may need to change schema names and/or table names if they differ from how the queries are written below. 

What You’ll Learn in this Tutorial

  • How to load multiple layers into PostgreSQL using a batch operation in QGIS.
  • How to execute SQL queries from the Browser panel in QGIS.
  • How to write basic SQL queries.
  • How to write basic spatial SQL queries.

Prerequisites

  • Install the latest QGIS LTR here.
  • Download and unzip the tutorial data here.
  • Access to a PostgreSQL/PostGIS database with permissions to create schemas and tables.
  • Load the tutorial data in QGIS and upload them into a schema called spatial_sql_tutorial (see exact instructions below).

1. Load Tutorial Data into PostgreSQL

Instructions for loading data into PostgreSQL can be found here. In order to execute the statements below, load the data as follows into a training database.

1. Create a schema in the training database called spatial_sql_tutorial.

2. Unzip the tutorial data and load all the layers in the Geopackage into QGIS.

3. Use the Export to PostgreSQL (available connections) in batch mode to upload all layers to the spatial_sql_tutorial schema. Set the following parameters:

  • Database: select your training database
  • Input layer: select all 10 layers from the geopackage (make sure you do not select duplicates).
  • Schema: select spatial_sql_tutorial for all.
  • Promote to multipart: No for all. We will need some single part geometries for the querties below.

Instructions for loading data into PostgreSQL can be found here.

4. The spatial_sql_tutorial schema should now have 10 layers within it.

2. Executing SQL Statements using the Browser Panel

There are many ways to execute SQL queries, both within QGIS and using tools like pgAdmin4 or even the command line interface tool called PSQL. In this tutorial, it is recommended to execute SQL queries using the QGIS Browser panel Execute SQL tool. The following explains how to use this tool:

1. In the Browser panel, expand a database.

2. Right-click on the database and select Execute SQL. The tool will open in a separate window.

3. Type or paste the SQL statement in the top panel then click the Execute button. The result will appear in the bottom panel OR an error will appear if the statement is invalid.

4. To load the results as a new layer, click “Load as a new layer” the set the following:

  • Column(s) with unique values: pick the primary key field (e.g. id).
  • Geometry column: if the result includes a geometry field, identify it here.
  • Layer name: Optionally, name the output layer.

Click Load Layer and the query layer will load in QGIS.

3. SELECT and FROM Clauses

The SELECT clause tells the DB which columns we want to return in a query. The FROM clause tells the database which table to query. 

3.1 To select all columns from the neighbourhoods table, we do this:

SELECT *
FROM spatial_sql_tutorial.neighbourhoods;

3.2 Or, if we just want the name of every neighbourhood:

SELECT name
FROM spatial_sql_tutorial.neighbourhoods;

3.3 Or, if we just want the name of every neighbourhood in alphabetical order:

SELECT name
FROM spatial_sql_tutorial.neighbourhoods
ORDER BY name;

3.4 To get specific columns, use a comma to separate them:

SELECT name, shape_area
FROM spatial_sql_tutorial.neighbourhoods
ORDER BY name;

4. Where Clause

The WHERE clause filters the rows based on a defined rule. For example, we might select only colours that are red or animals that are dogs. 

4.1 Select every neighbourhood that is named Rose Hill:

SELECT name
FROM spatial_sql_tutorial.neighbourhoods
WHERE name = 'ROSE HILL';

4.2 Or select every neighbourhood that IS NOT named Rose Hill:

SELECT name
FROM spatial_sql_tutorial.neighbourhoods
WHERE name != 'ROSE HILL';

5. Counting Rows

Sometimes we don’t want to return the list of rows, we just want to return the total number of rows. In this case, we can use the aggregate function count(*) in PostgreSQL to count the number of occurrences. 

5.1 How many neighbourhoods are there in Kamloops?

SELECT count(*)
FROM spatial_sql_tutorial.neighbourhoods;

5.2 How many parcels are there in Kamloops?

SELECT count(*)
FROM spatial_sql_tutorial.property_parcels;

5.3 How many city buildings are there in Kamloops?

SELECT count(*)
FROM spatial_sql_tutorial.city_building;

6. Summing Rows

The aggregate function sum() will sum a specific field. For example, if you want to sum a field called area, you can put sum(area) in the SELECT clause. If, however, you list more than one item in the SELECT clause, you must include the GROUP BY clause at the end of the query. 

6.1 A simple sum query to calculate the total area of all parks in Kamloops.

SELECT sum(shape_area)
FROM spatial_sql_tutorial.parks;

6.2 To get the park areas based on the sitetype field:

SELECT sitetype, sum(shape_area)
FROM spatial_sql_tutorial.parks
GROUP BY sitetype;

The following are some basic spatial queries for overlay analysis, measuring geometries, transforming geometries, spatial joins, proximity analysis, and various spatial output formats. Run these queries in your favourite query tool and make changes to change the results. It is important to spend time interacting with this material so you can use it effectively in your enterprise environment.

7. Buffer Analysis

Show a 500m buffer around water reservoirs.

SELECT id, ST_Buffer(geom,500) as geom
FROM spatial_sql_tutorial.water_reservoir;

Result

8. Buffer Intersect Analysis

Which parcels intersect a 500m buffer around each water reservoir?

SELECT a.*
FROM spatial_sql_tutorial.property_parcels a, spatial_sql_tutorial.water_reservoir b
WHERE st_intersects(a.geom,st_buffer(b.geom,500));

Result

9. Measuring Geometries

9.1 What is the total area of all parcels that intersect with the 500m buffer around water reservoirs? 

SELECT sum(st_area(a.geom))
FROM spatial_sql_tutorial.property_parcels a, spatial_sql_tutorial.water_reservoir b
WHERE st_intersects(a.geom,st_buffer(b.geom,500));

Result: 124831253.48186931

9.2 We can also use the round() function to reduce the number of decimal places, but the data type needs to be changed to numeric – a process called casting in RDBMSs. 

SELECT round(sum(st_area(a.geom))::numeric,2) as area_sqm
FROM spatial_sql_tutorial.property_parcels a, spatial_sql_tutorial.water_reservoir b
WHERE st_intersects(a.geom,st_buffer(b.geom,500));

Result: 124831253.48

9.3 What is the area of Waterfront Park (rounded to two decimal places)?

SELECT round(st_area(geom)::numeric)
FROM spatial_sql_tutorial.parks
WHERE name = 'WATERFRONT PARK';

Result: 31015.0

9.4 What is the perimeter of Waterfront Park (rounded to two decimal places)?

SELECT round(st_perimeter(geom)::numeric)
FROM spatial_sql_tutorial.parks
WHERE name = 'WATERFRONT PARK';

Result: 1449.0

9.5 Which park has the longest perimeter and what is it (rounded to two decimal places)?

SELECT name, round(st_perimeter(geom)::numeric)
FROM spatial_sql_tutorial.parks
ORDER BY st_perimeter(geom) DESC
LIMIT 1;

Result: KENNA CARTWRIGHT NATURE PARK | 27144.0

9.6 What is the total length (in km) for all roads in Kamloops by road class? 

SELECT classifica as road_class, round((sum(st_length(geom))/1000)::numeric,2) total_length_km
FROM spatial_sql_tutorial.street
GROUP BY classifica
ORDER BY classifica;

Result

10. Transforming Geometries

It is easy to transform geometries in PostGIS. There are a number of functions available to transform and extract information about geometries. In this section, we will discuss four common types of functions, namely:

  • ST_Transform. This reprojects a geometry from one CRS to another using the EPSG code. 
  • ST_X & ST_Y. This extracts the X and Y coordinates of points, respectively.
  • ST_Centroid. This function gets the centroid of a geometry as a point.

10.1 Transform the city_boundary layer to Pseudo-Mercator (EPSG:3857).

SELECT id,st_transform(geom,3857) as geom,name
FROM spatial_sql_tutorial.city_boundary;

Result: load the result as a query layer and examine properties to see CRS.

10.2 What are the X and Y coordinates of all tourism accommodation locations in Kamloops?

SELECT 	id, 
	name, 
	round(st_x(geom)::numeric,2) as x,
	round(st_y(geom)::numeric,2) as y
FROM spatial_sql_tutorial.tourism_accommodation
ORDER BY name;

Result

10.3 What are the X and Y coordinates in EPSG:3857 of all tourism accommodation locations in Kamloops?

SELECT 	id, 
	name, 
	round(st_x(st_transform(geom,3857))::numeric,2) as x,
	round(st_y(st_transform(geom,3857))::numeric,2) as y
FROM spatial_sql_tutorial.tourism_accommodation
ORDER BY name;

Result

10.4 What are the centroids of every city building in Kamloops?

SELECT 	id, 
	facilityid, 
	st_centroid(geom) as geom
FROM spatial_sql_tutorial.city_building
ORDER BY facilityid;

Result

11. Spatial Joins

Get the address of all warehouses in Kamloops by joining OSM building data with property parcel data. 

SELECT a.id, a.geom, a.name, b.address
FROM spatial_sql_tutorial.buildings AS a
LEFT JOIN spatial_sql_tutorial.property_parcels AS b
ON st_intersects(st_centroid(a.geom), b.geom)
WHERE a.building = 'warehouse'
ORDER BY b.address;

Results

12. Proximity Analysis

12.1 How far is each parcel from the nearest healthcare centre in Kamloops?

SELECT DISTINCT ON (a.id) 
	a.*, b.name,
	st_distance(a.geom,b.geom) as distance_m
FROM spatial_sql_tutorial.property_parcels AS a
LEFT JOIN spatial_sql_tutorial.health_care_centres b
ON st_dwithin(a.geom,b.geom,10000)
ORDER BY a.id, distance_m;

Result

12.2 What is the parcel catchment for each provincially-run clinic in Kamloops? Assume that residents will go to the closest clinic. 

WITH distance_table as (
	SELECT DISTINCT ON (a.id) 
		a.*, 
		b.name,
		b.classifica,
		b.operator,
		st_distance(a.geom,b.geom) as distance_m
	FROM spatial_sql_tutorial.property_parcels AS a
	LEFT JOIN spatial_sql_tutorial.health_care_centres b
		ON st_dwithin(a.geom,b.geom,100000)
	WHERE b.operator = 'PROVINCE' AND b.classifica = 'CLINIC' AND a.classifica = 'PARCEL'
	ORDER BY a.id, distance_m
)
SELECT name, count(name) AS parcels
FROM distance_table
GROUP BY name
ORDER BY parcels desc;

Results

12.3 What percentage of parcels are outside a 500m radius from all provincially-run clinics in Kamloops? 

WITH service_area_500m as (
	SELECT ST_Union(st_buffer(geom,500)) geom
	FROM spatial_sql_tutorial.health_care_centres
	WHERE	classifica = 'CLINIC'
)
SELECT round((count(a.*))::numeric / (SELECT count(*) 
FROM spatial_sql_tutorial.property_parcels WHERE "CLASSIFICA" = 'PARCEL')::numeric * 100,1) underserved_parcels
FROM spatial_sql_tutorial.property_parcels AS a, service_area_500m b
WHERE st_disjoint(a.geom,b.geom) AND a."CLASSIFICA" = 'PARCEL';

Result: 84.4%

13. Spatial Output Functions

The PostGIS extension also includes a number of spatial output functions to export geometries. For example, the following statement shows how to export geometries to Well-Known Text (WKT), Keyhole Markup Language (KML), and GeoJSON.

13.1 Export Geometries as WKT or EWKT

SELECT name, 
st_asText(geom) AS wkt, 
st_asewkt(geom) AS ewkt
FROM spatial_sql_tutorial.city_boundary;

Results

13.2 Export Geometries as KML

SELECT name, 
 st_asKML(geom) AS kml
 FROM spatial_sql_tutorial.city_boundary;

Results

13.3 Export Geometries as GeoJSON

SELECT name, 
st_asgeojson(geom) AS geojson
FROM spatial_sql_tutorial.city_boundary;

Results

Tutorial: Ask big questions with Spatial SQL

The best way to learn SQL is to do SQL and do a lot of SQL. The following are some queries that you can write and execute on the Kamloops data in your database. An answer sheet is also provided, but try to solve these problems on your own before consulting the answers. 

Basic Spatial SQL Queries

1. How many parcels have an area larger than 5000 m2?

2. How many parcels are within 500m of parks?

3. How many parcels are in the North Kamloops neighbourhood? 

4. What is the total length (in meters) of all streets rounded to two decimal places?

5. What is the total perimeter (in meters) of Riverside Park? Round to two decimal places.

(Scroll down to see answers)

Answers

Question 1: How many parcels have an area larger than 5000 m2?

SELECT count(*)
FROM spatial_sql_tutorial.property_parcels
WHERE st_area(geom) > 5000;

Result: 2687

Question 2: How many parcels are within 500m of parks?

WITH buffer_area as (
	SELECT st_union(st_buffer(geom,500)) geom
	FROM spatial_sql_tutorial.riparian_development_permit_area
)
SELECT count(a.*)
FROM spatial_sql_tutorial.property_parcels a, buffer_area b
WHERE st_intersects(a.geom,b.geom);

Result: 27438

Question 3: How many parcels are in the North Kamloops neighbourhood? 

SELECT count(a.*)
FROM spatial_sql_tutorial.property_parcels a, spatial_sql_tutorial.neighbourhoods b
WHERE st_intersects(a.geom,b.geom) AND b.name = 'NORTH KAMLOOPS';

Result: 3695

Question 4: What is the total length (in meters) of all streets rounded to two decimal places?

SELECT round(sum(st_length(geom))::numeric,2)
FROM spatial_sql_tutorial.street;

Result: 685920.21m

Question 5: What is the total perimeter (in meters) of Riverside Park? Round to two decimal places. 

SELECT round(st_perimeter(geom)::numeric,2) as perimeter_m
FROM spatial_sql_tutorial.parks
WHERE name = 'RIVERSIDE PARK';

Result: 1872.1m

How can we help?

Contact us today to schedule a free consultation with a member of our team.