Commit 888d18f

Merge pull request #86 from minus34/202408
202408
2 parents 5bacabb + 5315706 commit 888d18f

41 files changed, with 167 additions and 166 deletions

README.md

Lines changed: 8 additions & 8 deletions
@@ -51,8 +51,8 @@ The behaviour of gnaf-loader can be controlled by specifying various command lin
 
 #### Optional Arguments
 * `--srid` Sets the coordinate system of the input data. Valid values are `4283` (the default: GDA94 lat/long) and `7844` (GDA2020 lat/long).
-* `--geoscape-version` Geoscape version number in YYYYMM format. Defaults to current year and last release month. e.g. `202405`.
-* `--previous-geoscape-version` Previous Geoscape release version number as YYYYMM; used for QA comparison. e.g. `202402`.
+* `--geoscape-version` Geoscape version number in YYYYMM format. Defaults to current year and last release month. e.g. `202408`.
+* `--previous-geoscape-version` Previous Geoscape release version number as YYYYMM; used for QA comparison. e.g. `202405`.
 * `--raw-gnaf-schema` schema name to store raw GNAF tables in. Defaults to `raw_gnaf_<geoscape_version>`.
 * `--raw-admin-schema` schema name to store raw admin boundary tables in. Defaults to `raw_admin_bdys_<geoscape_version>`.
 * `--gnaf-schema` destination schema name to store final GNAF tables in. Defaults to `gnaf_<geoscape_version>`.
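
The defaults described in the hunk above (a version that falls back to the current year and last release month, and schema names derived from that version) could be sketched roughly as follows. This is an illustration only, not gnaf-loader's actual code: the function names are hypothetical, and the quarterly release months (February, May, August, November) are an assumption inferred from the example versions `202402`, `202405` and `202408`.

```python
from datetime import date

# Assumption: releases land in Feb, May, Aug and Nov, inferred from the
# example versions 202402, 202405 and 202408 in the README.
RELEASE_MONTHS = (2, 5, 8, 11)


def default_geoscape_version(today: date) -> str:
    """Return the most recent release month on or before `today` as YYYYMM."""
    past = [m for m in RELEASE_MONTHS if m <= today.month]
    if past:
        return f"{today.year}{max(past):02d}"
    # In January the last release was November of the previous year.
    return f"{today.year - 1}{RELEASE_MONTHS[-1]:02d}"


def previous_geoscape_version(version: str) -> str:
    """Return the release before `version` (YYYYMM), e.g. for QA comparison."""
    year, month = int(version[:4]), int(version[4:])
    idx = RELEASE_MONTHS.index(month)  # ValueError if not a release month
    if idx == 0:
        return f"{year - 1}{RELEASE_MONTHS[-1]:02d}"
    return f"{year}{RELEASE_MONTHS[idx - 1]:02d}"


def default_schemas(version: str) -> dict:
    """Schema names following the README's `<prefix>_<geoscape_version>` pattern."""
    return {
        "raw_gnaf_schema": f"raw_gnaf_{version}",
        "raw_admin_schema": f"raw_admin_bdys_{version}",
        "gnaf_schema": f"gnaf_{version}",
    }
```

Deriving every schema name from one version string is what lets a release bump like this commit be a plain find-and-replace of `202405` with `202408`.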
@@ -69,7 +69,7 @@ The behaviour of gnaf-loader can be controlled by specifying various command lin
 * `--no-boundary-tag` DO NOT tag all addresses with some of the key admin boundary IDs for creating aggregates and choropleth maps.
 
 ### Example Command Line Arguments
-* Local Postgres server: `python load-gnaf.py --gnaf-tables-path="C:\temp\geoscape_202405\G-NAF" --admin-bdys-path="C:\temp\geoscape_202405\Administrative Boundaries"` Loads the GNAF tables to a Postgres server running locally. GNAF archives have been extracted to the folder `C:\temp\geoscape_202405\G-NAF`, and admin boundaries have been extracted to the `C:\temp\geoscape_202405\Administrative Boundaries` folder.
+* Local Postgres server: `python load-gnaf.py --gnaf-tables-path="C:\temp\geoscape_202408\G-NAF" --admin-bdys-path="C:\temp\geoscape_202408\Administrative Boundaries"` Loads the GNAF tables to a Postgres server running locally. GNAF archives have been extracted to the folder `C:\temp\geoscape_202408\G-NAF`, and admin boundaries have been extracted to the `C:\temp\geoscape_202408\Administrative Boundaries` folder.
 * Remote Postgres server: `python load-gnaf.py --gnaf-tables-path="\\svr\shared\gnaf" --local-server-dir="f:\shared\gnaf" --admin-bdys-path="c:\temp\unzipped\AdminBounds_ESRI"` Loads the GNAF tables which have been extracted to the shared folder `\\svr\shared\gnaf`. This shared folder corresponds to the local `f:\shared\gnaf` folder on the Postgres server. Admin boundaries have been extracted to the `c:\temp\unzipped\AdminBounds_ESRI` folder.
 * Loading only selected states: `python load-gnaf.py --states VIC TAS NT ...` Loads only the data for Victoria, Tasmania and Northern Territory
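
To show how the example command lines above map onto parsed values, here is a minimal `argparse` sketch covering only the flags that appear in this hunk and the one before it. It is an assumption about how such a parser might look, not a copy of the parser in `load-gnaf.py`; `build_parser` is a hypothetical name and the help strings are paraphrased from the README.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Illustrative subset of the options documented in the README."""
    parser = argparse.ArgumentParser(description="Sketch of some load-gnaf.py options")
    parser.add_argument("--gnaf-tables-path",
                        help="folder the G-NAF archives were extracted to")
    parser.add_argument("--admin-bdys-path",
                        help="folder the admin boundaries were extracted to")
    parser.add_argument("--local-server-dir",
                        help="the same shared folder as seen from the Postgres server")
    parser.add_argument("--states", nargs="+",
                        help="load only these states, e.g. VIC TAS NT")
    parser.add_argument("--srid", type=int, choices=[4283, 7844], default=4283,
                        help="4283 = GDA94 lat/long (default), 7844 = GDA2020 lat/long")
    parser.add_argument("--no-boundary-tag", action="store_true",
                        help="skip tagging addresses with admin boundary IDs")
    return parser


if __name__ == "__main__":
    # Equivalent to: python load-gnaf.py --states VIC TAS NT
    args = build_parser().parse_args(["--states", "VIC", "TAS", "NT"])
    print(args.states, args.srid)
```

Using `nargs="+"` is what allows `--states VIC TAS NT` to be written as space-separated values rather than a single quoted string.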

@@ -117,8 +117,8 @@ Should take 15-60 minutes.
 - A knowledge of [Postgres pg_restore parameters](https://www.postgresql.org/docs/14/app-pgrestore.html)
 
 ### Process
-1. Download the [GNAF dump file](https://minus34.com/opendata/geoscape-202405/gnaf-202405.dmp) or [GNAF GDA2020 dump file](https://minus34.com/opendata/geoscape-202405-gda2020/gnaf-202405.dmp) (~2.0Gb)
-2. Download the [Admin Bdys dump file](https://minus34.com/opendata/geoscape-202405/admin-bdys-202405.dmp) or [Admin Bdys GDA2020 dump file](https://minus34.com/opendata/geoscape-202405-gda2020/admin-bdys-202405.dmp) (~2.8Gb)
+1. Download the [GNAF dump file](https://minus34.com/opendata/geoscape-202408/gnaf-202408.dmp) or [GNAF GDA2020 dump file](https://minus34.com/opendata/geoscape-202408-gda2020/gnaf-202408.dmp) (~2.0Gb)
+2. Download the [Admin Bdys dump file](https://minus34.com/opendata/geoscape-202408/admin-bdys-202408.dmp) or [Admin Bdys GDA2020 dump file](https://minus34.com/opendata/geoscape-202408-gda2020/admin-bdys-202408.dmp) (~2.8Gb)
 3. Edit the _restore-gnaf-admin-bdys.bat_ or _.sh_ script in the supporting-files folder for your dump file names, database parameters and for the location of pg_restore
 4. Run the script, come back in 15-60 minutes and enjoy!
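
The download-and-restore steps above can be sketched in Python as two small helpers that build the dump URLs and the `pg_restore` argument list for a given release. The URL pattern is taken from the links in this hunk and the `pg_restore` flags from the Dockerfile in this commit; the function names, defaults and the idea of driving the restore from Python are assumptions for illustration, not part of the project's restore script.

```python
BASE_URL = "https://minus34.com/opendata"


def dump_urls(version: str, gda2020: bool = False) -> dict:
    """Build the G-NAF and admin boundary dump URLs for a YYYYMM release."""
    folder = f"geoscape-{version}-gda2020" if gda2020 else f"geoscape-{version}"
    return {
        "gnaf": f"{BASE_URL}/{folder}/gnaf-{version}.dmp",
        "admin_bdys": f"{BASE_URL}/{folder}/admin-bdys-{version}.dmp",
    }


def pg_restore_command(dump_path: str, database: str = "postgres",
                       host: str = "localhost", port: int = 5432,
                       user: str = "postgres") -> list:
    """Return the pg_restore argument list for a custom-format (-Fc) dump."""
    return ["pg_restore", "-Fc", "-d", database, "-h", host,
            "-p", str(port), "-U", user, dump_path]


if __name__ == "__main__":
    for name, url in dump_urls("202408").items():
        print(name, url)
    print(" ".join(pg_restore_command("/data/gnaf-202408.dmp")))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` once the dump file has been downloaded; building a list rather than a shell string avoids quoting problems with paths that contain spaces.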

@@ -127,11 +127,11 @@ Geoparquet versions of the spatial tables, as well as parquet versions of the no
 
 Geometries have WGS84 lat/long coordinates (SRID/EPSG:4326). A sample query for analysing the data using [Apache Sedona](https://sedona.apache.org/), the spatial extension to [Apache Spark](https://spark.apache.org/) is in the `spark` folder.
 
-The files are here: `s3://minus34.com/opendata/geoscape-202405/geoparquet/`
+The files are here: `s3://minus34.com/opendata/geoscape-202408/geoparquet/`
 
 ### AWS CLI Examples:
-- List all datasets: `aws s3 ls s3://minus34.com/opendata/geoscape-202405/geoparquet/`
-- Copy all datasets: `aws s3 sync s3://minus34.com/opendata/geoscape-202405/geoparquet/ <my-local-folder>`
+- List all datasets: `aws s3 ls s3://minus34.com/opendata/geoscape-202408/geoparquet/`
+- Copy all datasets: `aws s3 sync s3://minus34.com/opendata/geoscape-202408/geoparquet/ <my-local-folder>`
 
 ## DATA LICENSES

docker/Dockerfile

Lines changed: 9 additions & 9 deletions
@@ -1,7 +1,7 @@
 FROM debian:bookworm-slim
 
 # replaced the downloading of the Postgres dump files to use local files instead (for performance)
-# ARG BASE_URL="https://minus34.com/opendata/geoscape-202405"
+# ARG BASE_URL="https://minus34.com/opendata/geoscape-202408"
 # ENV BASE_URL ${BASE_URL}
 
 # Postgres user password - WARNING: change this to something a lot more secure
@@ -33,23 +33,23 @@ RUN echo "listen_addresses='*'" >> /etc/postgresql/15/main/postgresql.conf
 RUN mkdir -p /data
 WORKDIR /data
 
-ADD gnaf-202405.dmp .
-ADD admin-bdys-202405.dmp .
+ADD gnaf-202408.dmp .
+ADD admin-bdys-202408.dmp .
 
 # replace the add statements above if wanting to download Postgres dump files
 # RUN cd /data \
-# && wget --quiet ${BASE_URL}/gnaf-202405.dmp \
-# && wget --quiet ${BASE_URL}/admin-bdys-202405.dmp
+# && wget --quiet ${BASE_URL}/gnaf-202408.dmp \
+# && wget --quiet ${BASE_URL}/admin-bdys-202408.dmp
 
 RUN /etc/init.d/postgresql start \
-    && pg_restore -Fc -d postgres -h localhost -p 5432 -U postgres /data/gnaf-202405.dmp \
+    && pg_restore -Fc -d postgres -h localhost -p 5432 -U postgres /data/gnaf-202408.dmp \
     && /etc/init.d/postgresql stop \
-    && rm /data/gnaf-202405.dmp
+    && rm /data/gnaf-202408.dmp
 
 RUN /etc/init.d/postgresql start \
-    && pg_restore -Fc -d postgres -h localhost -p 5432 -U postgres /data/admin-bdys-202405.dmp \
+    && pg_restore -Fc -d postgres -h localhost -p 5432 -U postgres /data/admin-bdys-202408.dmp \
     && /etc/init.d/postgresql stop \
-    && rm /data/admin-bdys-202405.dmp
+    && rm /data/admin-bdys-202408.dmp
 
 EXPOSE 5432

docker/xx_code_snippets.sh

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 cd /Users/$(whoami)/git/minus34/gnaf-loader/docker
 
 # build gnaf loader image
-docker build --squash --tag minus34/gnafloader:latest --tag minus34/gnafloader:202405 .
+docker build --squash --tag minus34/gnafloader:latest --tag minus34/gnafloader:202408 .
 
 # run gnaf loader container
 docker run --name=gnafloader --publish=5433:5432 minus34/gnafloader:latest

load-gnaf.py

Lines changed: 1 addition & 1 deletion
@@ -214,7 +214,7 @@ def populate_raw_gnaf(pg_cur):
     # load all PSV files using multiprocessing
     geoscape.multiprocess_list("sql", sql_list, logger)
 
-    # fix missing geocodes (added due to missing data in 202405 release)
+    # fix missing geocodes (added due to missing data in 202408 release)
     sql = geoscape.open_sql_file("01-04-raw-gnaf-fix-missing-geocodes.sql")
     pg_cur.execute(sql)

postgres-scripts/01-04-raw-gnaf-fix-missing-geocodes.sql

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
--- workaround for missing default coordinates - 202405 release issue
+-- workaround for missing default coordinates - 202408 release issue
 with missing as (
     select address_detail_pid
     from raw_gnaf.address_default_geocode

postgres-scripts/02-02a-prep-admin-bdys-tables.sql

Lines changed: 2 additions & 2 deletions
@@ -203,10 +203,10 @@ UPDATE admin_bdys.locality_bdys
 ;
 
 
--- -- add old locality_pids to unedited localities -- need to rollover old locality pids from GNAF 202405 release - not supplied in 202405 release
+-- -- add old locality_pids to unedited localities -- need to rollover old locality pids from GNAF 202408 release - not supplied in 202408 release
 -- UPDATE admin_bdys.locality_bdys as new
 -- SET old_locality_pid = old.old_locality_pid
--- FROM admin_bdys_202405.locality_bdys AS old
+-- FROM admin_bdys_202408.locality_bdys AS old
 -- WHERE new.locality_pid = old.locality_pid;
 

postgres-scripts/xx-04-02-manual-bdy-tags.sql

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@
 
 
 -- fix 35 boatsheds
-update gnaf_202405.address_principal_admin_boundaries
+update gnaf_202408.address_principal_admin_boundaries
     set lga_pid = 'lgacbffb11990f2',
         lga_name = 'Hobart City'
     where locality_pid = 'loc0f7a581b85b7'

postgres-scripts/xx-add-elevation-to-gnaf.sql

Lines changed: 2 additions & 2 deletions
@@ -43,7 +43,7 @@ DROP TABLE IF EXISTS temp_gnaf_100m_points;
 --
 -- SELECT ST_Value(dem.rast, gnaf.geom) as elevation,
 -- *
--- FROM gnaf_202405.address_principals as gnaf
--- INNER JOIN gnaf_202405.srtm_3s_dem as dem on ST_Intersects(gnaf.geom, dem.rast) limit 100;
+-- FROM gnaf_202408.address_principals as gnaf
+-- INNER JOIN gnaf_202408.srtm_3s_dem as dem on ST_Intersects(gnaf.geom, dem.rast) limit 100;
 

postgres-scripts/xx-alias-principals-with-different-coordinates.sql

Lines changed: 3 additions & 3 deletions
@@ -6,9 +6,9 @@ SELECT als.gnaf_pid, als.street_locality_pid, als.locality_pid, als.alias_princi
         ST_MakePoint(als.longitude, als.latitude)::geography,
         ST_MakePoint(gnaf.longitude, gnaf.latitude)::geography
     ) as distance
-FROM gnaf_202405.address_aliases as als
-INNER JOIN gnaf_202405.address_alias_lookup as lkp on als.gnaf_pid = lkp.alias_pid
-INNER JOIN gnaf_202405.address_principals as gnaf on lkp.principal_pid = gnaf.gnaf_pid
+FROM gnaf_202408.address_aliases as als
+INNER JOIN gnaf_202408.address_alias_lookup as lkp on als.gnaf_pid = lkp.alias_pid
+INNER JOIN gnaf_202408.address_principals as gnaf on lkp.principal_pid = gnaf.gnaf_pid
 WHERE als.latitude <> gnaf.latitude
     OR als.longitude <> gnaf.longitude
 order by ST_Distance(

postgres-scripts/xx-export-address-principals-to-csv.sql

Lines changed: 1 addition & 1 deletion
@@ -6,5 +6,5 @@ COPY (
     address, locality_name, postcode, state, locality_postcode, confidence,
     legal_parcel_id, mb_2016_code, mb_2021_code, latitude, longitude,
     geocode_type, reliability
-    FROM gnaf_202405.address_principals
+    FROM gnaf_202408.address_principals
 ) TO '/Users/hugh.saalmans/tmp/address_principals.psv' HEADER CSV;
