Parcel calculation #49

mpeltriaux · 2021-12-14T12:11:43+01:00

mpeltriaux commented

2021-12-14 12:11:43 +01:00

Status quo

There is no parcel calculation so far (Flurstücksberechnung).

Feature

A new geometry (regardless of the object it belongs to) should be intersected with a WFS to derive information about the underlying parcels, administrative regions and so on.

Storage

There are two possible ways to store this data.

SQL database

We create a new model, e.g. Parcel, which holds the 'Flurstück', 'Flurstückzähler', 'Gemarkung', 'Landkreis', and so on.
We create M2M relations between the Parcel and Geometry models. This way we can reach all parcels related to a geometry and, for each parcel, all geometries that overlap it.
This information can change over time, since the parcel layout changes unpredictably, sometimes after a short period and sometimes after a long one. It is therefore necessary to recalculate it from time to time. There are several things we could do here:
1. Add an updated_on timestamp to Parcel, which tracks the age of the calculated information and can be displayed to users in the frontend.
2. Give the frontend user a button such as Update parcels to rerun the parcel calculation manually. This would be a typical 'force' action that does not run on a fixed interval.
3. In addition to the force button in the frontend, all geometries could be recalculated once a month, e.g. on a Sunday, so that the stored parcel data is refreshed automatically every month. For this, the recalculation should be callable as a command, e.g. from a cron job.
4. Since a parcel that a geometry overlaps now may no longer be overlapped on the next run due to changes in the parcel layout, a sanity command should run at the end of each periodic recalculation and remove unlinked parcel entries from the database.
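The link table and the sanity step from point 4 can be sketched outside of the model layer. This is only an illustration using Python's built-in sqlite3 module. The table and column names (parcel, geometry, parcel_geometry, flurstueck, gemarkung) and the helper remove_unlinked_parcels are assumptions made for this example, not the project's actual schema.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE geometry (id INTEGER PRIMARY KEY);
    CREATE TABLE parcel (
        id INTEGER PRIMARY KEY,
        flurstueck TEXT NOT NULL,
        gemarkung TEXT NOT NULL,
        updated_on TEXT NOT NULL
    );
    -- M2M link table between parcel and geometry (hypothetical name)
    CREATE TABLE parcel_geometry (
        parcel_id INTEGER NOT NULL REFERENCES parcel(id),
        geometry_id INTEGER NOT NULL REFERENCES geometry(id),
        PRIMARY KEY (parcel_id, geometry_id)
    );
""")


def remove_unlinked_parcels(conn):
    """Delete parcels that no geometry links to; return how many were removed."""
    cur = conn.execute(
        "DELETE FROM parcel WHERE id NOT IN (SELECT parcel_id FROM parcel_geometry)"
    )
    conn.commit()
    return cur.rowcount


# Two parcels exist, but only parcel 1 is still linked to a geometry.
now = datetime.now(timezone.utc).isoformat()
conn.execute("INSERT INTO geometry (id) VALUES (1)")
conn.executemany(
    "INSERT INTO parcel (id, flurstueck, gemarkung, updated_on) VALUES (?, ?, ?, ?)",
    [(1, "12/3", "Example A", now), (2, "45", "Example A", now)],
)
conn.execute("INSERT INTO parcel_geometry (parcel_id, geometry_id) VALUES (1, 1)")
removed = remove_unlinked_parcels(conn)
```

Running the cleanup only after a full recalculation matters: during a run, a parcel may be unlinked for a moment before it is linked again.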

No SQL database

We use e.g. Redis instead of the model-based approach.
Redis entries can have a TTL, which defines for how many seconds an entry exists before Redis drops it automatically.
A Redis entry (always a key:value pair) could therefore use a Geometry id as key and a JSON string as value, storing the results of the last parcel calculation.
Each entry could have a TTL of e.g. one month and would be deleted afterwards.
If the parcel information is needed, e.g. to be displayed in the frontend, the system looks in Redis for an entry for the current geometry. If there is no such entry, the calculation is done on the fly and the result is stored in Redis for the next time (or until it expires after e.g. a month). If the entry is found, the data can be used directly.
A disadvantage of this approach is that every entry has a different timestamp: one was calculated yesterday, another five minutes ago, the next a second ago and so on. Changes in the official parcel layout would therefore not show up in all entries at the same time. We would need to either 1) wait until each old entry has expired so that it is recalculated, or 2) compare an old calculation with a new one and, if they differ, force-kill all entries in Redis so that everything is recalculated the next time the data is requested.
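The lookup described above is a cache-aside pattern with expiring entries. The following sketch shows the control flow only. TTLStore is a small in-memory stand-in written for this example (its setex/get methods mirror the Redis SETEX and GET commands, but it is not a Redis client), and the key format parcels:<id> and the function get_parcels are assumptions.

```python
import json
import time


class TTLStore:
    """Tiny in-memory stand-in for Redis with per-key expiry (not a real client)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (expires_at, value)

    def setex(self, key, ttl_seconds, value):
        self._data[key] = (self._clock() + ttl_seconds, value)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired entries are dropped, as Redis would do
            return None
        return value


ONE_MONTH = 30 * 24 * 60 * 60  # TTL in seconds


def get_parcels(store, geometry_id, calculate):
    """Return cached parcel data, calculating and caching it on a miss."""
    key = f"parcels:{geometry_id}"
    cached = store.get(key)
    if cached is not None:
        return json.loads(cached)
    result = calculate(geometry_id)
    store.setex(key, ONE_MONTH, json.dumps(result))
    return result
```

Because each entry starts its own one-month countdown when it is written, two geometries over the same parcels can show different states until both entries have expired. This is the drawback described above.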

Conclusion

The SQL approach is preferable, since the stored data can easily be used as a filter, e.g. in the overview filter sections for interventions, compensations and so on. This kind of filtering would not be possible with decent performance on JSON strings stored in Redis.

Calculation

A suitable WFS will be used for the calculation. I have just asked whether there is e.g. an existing internal WFS that can be used and that delivers all the needed data at once. Otherwise we would need to run intersections against multiple WFS, which means more network traffic and slower performance.
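As an illustration of what one such intersection request could look like, the sketch below builds a WFS 2.0.0 GetFeature URL with an Intersects filter. Everything specific in it is an assumption: the endpoint https://example.org/wfs, the feature type ns:parcels, the geometry property name geom, the CRS and the helper build_intersects_url are placeholders, and the real service may expect a POST request or a different WFS version.

```python
from urllib.parse import urlencode


def build_intersects_url(base_url, type_name, geometry_property, coords, srs_name):
    """Build a WFS 2.0.0 GetFeature URL that filters features intersecting a polygon.

    coords is a closed ring of (x, y) tuples (first point equals last point).
    """
    pos_list = " ".join(f"{x} {y}" for x, y in coords)
    filter_xml = (
        '<fes:Filter xmlns:fes="http://www.opengis.net/fes/2.0" '
        'xmlns:gml="http://www.opengis.net/gml/3.2">'
        "<fes:Intersects>"
        f"<fes:ValueReference>{geometry_property}</fes:ValueReference>"
        f'<gml:Polygon gml:id="query" srsName="{srs_name}">'
        f"<gml:exterior><gml:LinearRing><gml:posList>{pos_list}</gml:posList>"
        "</gml:LinearRing></gml:exterior></gml:Polygon>"
        "</fes:Intersects></fes:Filter>"
    )
    params = {
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "typeNames": type_name,
        "filter": filter_xml,
    }
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint, feature type, property name and CRS.
url = build_intersects_url(
    "https://example.org/wfs",
    "ns:parcels",
    "geom",
    [(0, 0), (10, 0), (10, 10), (0, 10), (0, 0)],
    "urn:ogc:def:crs:EPSG::25832",
)
```

With a single WFS that returns parcel and district attributes together, one such request per geometry would be enough. With several services, the same geometry would have to be sent once per service.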

mpeltriaux added the

feature

label 2021-12-14 12:11:43 +01:00

mpeltriaux self-assigned this 2021-12-14 12:11:43 +01:00

mpeltriaux commented

2021-12-16 12:18:42 +01:00

Parcel and District

To avoid redundant data inside the Parcel model described above, we introduce another model called District.

District

Holds data on 'Kreis', 'Verbandsgemeinde' and 'Gemeinde' as well as an M2M relation to Geometry. This way we reduce the amount of data stored in Parcel:

A geometry can be related to a hundred parcel entries that are all located in the same district. Each parcel entry therefore does not need to repeat the fields named above and instead holds a foreign key to a District entry.
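The effect of this split can be shown with plain dataclasses. This is a sketch under assumptions, not the project's models: the field names, the flat input rows and the helper build_parcels are invented for the example, and the district attribute only plays the role of the foreign key.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class District:
    kreis: str
    verbandsgemeinde: str
    gemeinde: str


@dataclass
class Parcel:
    flurstueck: str
    gemarkung: str
    district: District  # plays the role of the foreign key


def build_parcels(rows):
    """Turn flat result rows into parcels that share District instances."""
    districts = {}
    parcels = []
    for row in rows:
        key = (row["kreis"], row["verbandsgemeinde"], row["gemeinde"])
        # Reuse the District already stored for this key, if there is one.
        district = districts.setdefault(key, District(*key))
        parcels.append(Parcel(row["flurstueck"], row["gemarkung"], district))
    return parcels, list(districts.values())


# One hundred parcels that all lie in the same district.
rows = [
    {"flurstueck": str(n), "gemarkung": "Example A", "kreis": "Kreis X",
     "verbandsgemeinde": "VG Y", "gemeinde": "Gemeinde Z"}
    for n in range(1, 101)
]
parcels, districts = build_parcels(rows)
```

The hundred parcels end up referencing a single District, so the three district fields are stored once instead of a hundred times.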

mpeltriaux referenced this issue from a commit

2021-12-17 17:30:34 +01:00

#49 Parcels and Districts

mpeltriaux referenced this issue from a commit

2021-12-17 17:30:34 +01:00

#49 Parcels and Districts

mpeltriaux referenced this issue from a commit

2022-01-04 16:52:15 +01:00

#49 Parcels and Districts

mpeltriaux referenced this issue from a commit

2022-01-04 16:52:15 +01:00

#49 Calculation implementation

mpeltriaux referenced this issue from a commit

2022-01-04 16:52:15 +01:00

#49 Extends sanitize db command

mpeltriaux referenced this issue from a commit

2022-01-04 16:52:15 +01:00

#49 Update all parcels command

mpeltriaux commented

2022-01-05 15:14:36 +01:00

Finished parcels

Parcels are now being updated each time the general form of a dataset (intervention, compensation, eco-account or ema) is submitted.

Parcel update command

The update_all_parcels command processes all non-empty geometries in the database and recalculates their parcels.

Example

$ python manage.py update_all_parcels

This command can be run e.g. from a cron job to force-update all parcels once a month.
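The core loop of such a command might look roughly like the sketch below. It is not the actual implementation. Geometries are represented as plain dicts, and the keys coords, parcels and parcels_updated_on as well as the calculate callback are assumptions made to keep the example self-contained.

```python
from datetime import datetime, timezone


def update_all_parcels(geometries, calculate, now=None):
    """Recalculate parcels for every non-empty geometry; return the number processed."""
    now = now or datetime.now(timezone.utc)
    processed = 0
    for geometry in geometries:
        if not geometry.get("coords"):
            continue  # empty geometries are skipped
        geometry["parcels"] = calculate(geometry["coords"])
        geometry["parcels_updated_on"] = now
        processed += 1
    return processed
```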

Update runtime

Due to the network limitations of the official WFS based approach, another issue has been opened to enhance the UX: #55

grafik.png

74 KiB

mpeltriaux referenced this issue from a commit

2022-01-05 15:26:42 +01:00

#49 Frontend rendering

mpeltriaux referenced this issue from a commit

2022-01-05 15:26:42 +01:00

#49 Parcels on report

mpeltriaux referenced this issue from a commit

2022-01-05 15:26:42 +01:00

#49 Annual report improve

mpeltriaux referenced this issue