1

Re: How to deal with fuzziness of location data

Hi,

we are wondering how to handle the fuzziness of sample locations. In our old database we allow users to set a fuzziness value when they enter data. In Indicia we use a sample attribute for this value.
The problem arises when querying data by a polygon: samples near the polygon with high fuzziness are not selected, even though their fuzziness makes it quite likely that they lie inside the polygon. We have successfully tried manipulating the database to save a "circle" as the geometry, with the fuzziness value as the radius.
This worked fine for the queries. Is there a better way to handle this? Manually manipulating the database is not really an option.

Regards

Daniel

2

Hi

I am using Indicia with a national grid reference system, and sample locations are stored as squares, with the size of the square indicating the accuracy of the record. When a user indicates the sample location on the map, a web service call returns the square, with its size varying according to how far the map is zoomed in.

On the other hand, where latitude and longitude are used as the spatial reference system, a point location is stored with no indication of precision.

In recent correspondence with John van Breda about this he replied to me "We ought to add a precision field. The geom stored is indeed a point which is probably preferable to trying to store a circle with radius."

However, if a point and a precision are stored, then I guess that for every spatial query the point would have to be expanded to a polygon to see whether it intersects the query area, which may be more trouble than storing a polygon in the first place.

Jim Bacon.

3

PostGIS supports buffering of polygons or points on the fly, so we have the option of either doing this as part of the query itself, using the proposed precision field as an input parameter, or adding a second geometry to store the buffered version for reporting against. As it stands, using Daniel's custom attribute approach, it ought to be possible to write a report that joins to this sample attribute and then calls st_buffer(geom, precision attribute value) in the SQL.
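
To illustrate what the on-the-fly buffer achieves, here is a minimal Python sketch of the equivalent test: a point buffered by its precision intersects the query polygon exactly when the point lies inside the polygon or within that distance of its boundary. This is not Indicia or PostGIS code. The function names are hypothetical, and it assumes planar coordinates in metres and a simple polygon given as a list of vertices.

```python
import math


def point_in_polygon(pt, ring):
    """Ray-casting test. `ring` is a list of (x, y) vertices, not closed."""
    x, y = pt
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through pt can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def distance_to_segment(pt, a, b):
    """Shortest planar distance from pt to the segment a-b."""
    px, py = pt
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Project pt onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def fuzzy_point_matches(pt, precision_m, ring):
    """True if a circle of radius precision_m around pt touches the polygon."""
    if point_in_polygon(pt, ring):
        return True
    n = len(ring)
    nearest = min(
        distance_to_segment(pt, ring[i], ring[(i + 1) % n]) for i in range(n)
    )
    return nearest <= precision_m


# A 100 m query square and a sample 20 m east of it.
query_square = [(0, 0), (100, 0), (100, 100), (0, 100)]
sample = (120, 50)
print(fuzzy_point_matches(sample, 0, query_square))   # plain point: no match
print(fuzzy_point_matches(sample, 25, query_square))  # 25 m fuzziness: match
```

With a precision of 0 the sample is missed, as in Daniel's original problem; with a precision of 25 m it is selected.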

John van Breda
Biodiverse IT

4

Hi,

thanks for the hint on how to use our precision attribute.
It's good to hear that you plan to support this directly in the database schema. I wonder whether creating the buffer on the fly in spatial queries is too expensive: the query has to check whether a buffer is needed, create it, and then test whether it is in the area I'm searching. I think it would be more economical to save the buffered polygon in an extra column, since spatial queries are quite common.

Regards

Daniel

5

Yes, I think you are probably right. Buffering a single query polygon to search within is quite different to buffering the entire dataset of occurrences.
Best wishes

John van Breda
Biodiverse IT

6 (edited by Jim Bacon 15-11-2011 10:43:00)

Would another option be to create an sref module, as we have for other coordinate systems, that has an sref_to_wkt function that returns a polygon which is more or less circular, centred on the selected point? When setting up the client website you could choose this for fuzzy lat/long data entry. This would not need any change to the core code or data structure.
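
As a rough illustration of what such a function might return, the Python sketch below builds a WKT polygon approximating a circle around a lat/long point. It is not an Indicia sref module and does not follow that interface: the function name, the `segments` parameter and the degree-per-metre approximation are all assumptions made for the example. The output is in longitude/latitude degrees and would still need transforming to whatever projection the database stores. The approximation breaks down near the poles.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius in metres


def fuzzy_latlong_to_wkt(lat, lon, radius_m, segments=32):
    """Hypothetical helper: WKT for a near-circular polygon around a point.

    A radius of 0 (or less) falls back to a plain point.
    """
    if radius_m <= 0:
        return f"POINT({lon:.6f} {lat:.6f})"
    # Convert the radius from metres to degrees. One degree of longitude
    # shrinks with the cosine of the latitude, so it needs a larger offset.
    dlat = math.degrees(radius_m / EARTH_RADIUS_M)
    dlon = dlat / math.cos(math.radians(lat))
    coords = []
    for i in range(segments):
        angle = 2 * math.pi * i / segments
        coords.append((lon + dlon * math.cos(angle), lat + dlat * math.sin(angle)))
    coords.append(coords[0])  # WKT rings must be closed
    body = ", ".join(f"{x:.6f} {y:.6f}" for x, y in coords)
    return f"POLYGON(({body}))"


print(fuzzy_latlong_to_wkt(51.5, -0.12, 100, segments=8))
```

More segments give a rounder polygon at the cost of a larger geometry to store and query.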

If this were created, Daniel could then run an update on his dataset to convert it to this form. The entered_sref and the fuzzy fields would retain the original information, but the geom field would now contain a polygon rather than a point. We would need to invent a value for sref_system.

Jim Bacon.

7

I'm not sure - I think that the fuzziness issue is fairly generic, maybe even applying to grids, and certainly applying to other point based systems including those used in Germany. Plus your solution still requires the addition of a fuzziness field to the data structures.

If we add a field for fuzziness to samples (we should check what the NBN Gateway calls that field, but the documentation seems to be offline), then it is simply a matter of including this value in submissions when required. We can then either buffer the existing geom field in the model code when this value is present, or add a second geom for the buffered version.

John van Breda
Biodiverse IT

8

My thought was that the fuzziness could be derived from the size of the geom polygon rather than adding a new field to the data structure - indeed I have already had to do exactly this. It wasn't very pretty.

I can see the merit of doing this once rather than many times for each point-based system too.

It is a required field in the NBN exchange format even when using a grid. (See Precision field in http://www.nbn.org.uk/Guidebooks/Data/A-beginners-guide-to-sharing-data/The-NBN-Exchange-format.aspx)

So I think you are right to add a Precision field to samples, but do you need a second geom field? You have stored the entered_sref; if the Precision is 0 then the geom could be a point, and if the Precision is non-zero then the geom could be a polygon.

Jim Bacon

9

Agreed. So, the spec is:
1) Add precision field to the samples table.
2) For grid systems, this precision is the size of the grid square in metres.
3) For point systems, this precision reflects the possible inaccuracy of the point in metres.
4) When entering a point reference, there should be an extra input box for the precision. There should also be an option to set a default, or to set a default and hide the box.

Anything I've missed?

John van Breda
Biodiverse IT

10

johnvanbreda wrote:

Agreed. So, the spec is:
1) Add precision field to the samples table.
2) For grid systems, this precision is the size of the grid square in metres.
3) For point systems, this precision reflects the possible inaccuracy of the point in metres

I'm not sure I understand this correctly: what is the size of the grid square? The longest distance from the centre to a corner? This would suit queries like "only show me samples which are precise to at least 5 metres", but it would not allow one to say "I'm not exactly sure if the sample is in this grid square or its western neighbour". But I have no idea how to express the second case either.
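
The two readings of "size" give noticeably different numbers, which matters for a precision filter. The short Python sketch below compares them for a square of a given side length. The thread does not settle which reading the spec intends, and the function names are made up for this example.

```python
import math


def square_precision_measures(side_m):
    """Candidate 'precision' values for a grid square of the given side."""
    return {
        "side_m": side_m,
        "centre_to_edge_m": side_m / 2,
        "centre_to_corner_m": side_m * math.sqrt(2) / 2,
    }


def passes_precision_filter(precision_m, max_allowed_m):
    """True if a record is at least as precise as the requested limit."""
    return precision_m <= max_allowed_m


measures = square_precision_measures(1000)
print(measures)
# A 1 km square passes a 750 m filter only under the centre-to-corner reading.
print(passes_precision_filter(measures["side_m"], 750))
print(passes_precision_filter(measures["centre_to_corner_m"], 750))
```

For a 1 km square the side length is 1000 m, but the longest centre-to-corner distance is only about 707 m, so the same record can pass or fail a filter depending on the definition chosen.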

johnvanbreda wrote:

4) When entering a point reference, there should be an extra input box for the precision. There should also be an option to set a default, or to set a default and hide the box.

Anything I've missed?

The decision whether we should save the buffered geom in addition to the point, or only the buffered geom. I would prefer the former, because then you can use the point for queries where the user doesn't want to use the precision information.

Best wishes

Daniel