1 (edited by RobLarge 03-12-2009 11:08:52)

Re: More validation issues

Having fixed the problem we had with imports & exports recently (which was related to problems with the StdValLib.dll validation library), we thought we should check out the state of our data holding.

Using the Data Validation Toolkit beta on our whole Recorder dataset revealed 40 invalid records (from a total of more than 700,000, so not too bad really). Interestingly, several of these predate the problem we just fixed, in that they were imported from Recorder 3 last year. All were related to bad spatial references. One survey event was somehow imported with no spatial reference at all; consequently it and its only sample (of 12 records) have no spatial ref either. I didn't think this was even possible with Recorder 6.

By way of a check, I have just right-clicked on the survey containing this bad event and chosen Revalidate record (which I assume uses the dll mentioned above). After a long wait it reported only 3 defective records in the survey, none of which were from the faulty sample (so the validation isn't picking up the lack of a spatial ref).

Stranger still, the three records identified were reported as "The Determination Date cannot come before the Sample Date." However, in each case the determination date appears to exactly match the sample date, i.e. the records were in fact valid and the tool reported them wrongly.

I edited the survey event in each case and retyped the exact same date, saved (and told it to update all child samples) and re-ran the validation check, but this made no difference. Closer inspection of the data tables revealed that in one example the determination had vague date fields (start, end and type) of 34914, 34914 and D, while the survey event vague date fields were 34914, 0 and D (i.e. the vague date end field is invalid).

Clearly the validation check performed by the dll is not working properly. I have discovered that 311 survey events in the dataset have the same problem, all dating from the Recorder 3 import, and a further 33 survey events have vague dates 0,0,D, which are probably also suspect. This demonstrates that the validation checks were not working properly then either.
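For anyone wanting to screen for the same pattern, the check described above can be sketched in a few lines of Python. This is only an illustration, not how Recorder or the Data Validation Toolkit works internally: the `VagueDate` tuple and `is_suspect_day_date` function are hypothetical names, and the rule itself (a type D date should have a non-zero start and an end equal to the start) is an assumption based on the values quoted in this post.

```python
from typing import NamedTuple


class VagueDate(NamedTuple):
    start: int  # value of the vague date start field
    end: int    # value of the vague date end field
    type: str   # vague date type code, e.g. "D"


def is_suspect_day_date(vd: VagueDate) -> bool:
    """Flag type "D" vague dates whose start/end pair looks inconsistent.

    Assumption: a sound type "D" date has a non-zero start and end == start.
    Other date types are not judged by this sketch.
    """
    if vd.type != "D":
        return False
    return vd.start == 0 or vd.end != vd.start


# The three patterns mentioned in the post.
rows = {
    "determination": VagueDate(34914, 34914, "D"),
    "survey event": VagueDate(34914, 0, "D"),
    "zero date": VagueDate(0, 0, "D"),
}

suspect = [name for name, vd in rows.items() if is_suspect_day_date(vd)]
print(suspect)  # ['survey event', 'zero date']
```

The same test could be expressed as a query against the survey event table, but the table and column names would need checking against the actual database first.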

Also interestingly, I find that after re-entering the dates for each survey event and saving as described above, they still have the same erroneous vague date fields.

So it would seem that the validation dll is untrustworthy as it cannot successfully detect spatial reference errors, nor can it report correctly the cause of a discrepancy in vague dates.

Furthermore, the Data Validation Toolkit beta cannot detect defective vague dates in Recorder.

Incidentally, when I located the sample with the missing spatial refs and did the right-click validation on this alone, it passed validation. This sample has location name set to Unknown and no spatial reference and yet it passes validation.

I realise that these are historic problems relating to Recorder 3, but my confidence in Recorder 6 validation has been knocked a bit today.

Rob Large
Wildlife Sites Officer
Wiltshire & Swindon Biological Records Centre

2

Recorder allows users to enter observations without grid references. The rules are that some location information must be entered, i.e. any combination of Location from the location hierarchy, Location Name (text/comment only) and Spatial Ref. Thus an observation with something in Location Name only is valid. Location Name was in Recorder 3 but not in Recorder 2000 when it first came out. It was added soon afterwards.
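The rule as stated here can be written out as a small Python sketch. It is an illustration of the rule only, not Recorder's actual validation code; the function name `has_location_info` and its parameters are hypothetical.

```python
from typing import Optional


def has_location_info(
    location: Optional[str],
    location_name: Optional[str],
    spatial_ref: Optional[str],
) -> bool:
    """Return True if at least one of the three location fields is filled in."""
    fields = (location, location_name, spatial_ref)
    return any(value is not None and value.strip() != "" for value in fields)


# A sample with only a Location Name passes, whatever the text says.
print(has_location_info(None, "Unknown", None))  # True
print(has_location_info(None, "", None))         # False
```

Under this rule a Location Name of "Unknown" counts as location information, which is why the sample described above passes validation.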

The date problems look like a variation on http://forums.nbn.org.uk/viewtopic.php?pid=4057#p4057 . It sounds as though the test is for the determination date to be on or after the start and end dates of the sample. Date problems are quite common in Recorder 3 so these problems are likely to have originated there.

Sally Rankin, JNCC Recorder Approved Expert
E-mail: s.rankin@btinternet.com
Telephone: 01491 578633
Mobile: 07941 207687

3

Thanks Sally

I am quite sure that the date problem did originate in Recorder 3. I have fixed it now, at least for the records which didn't have unknown dates; those will take a bit more searching.

Out of interest, do you know why records are allowed without a proper spatial reference? My problem is that the Location Name field for the records in question says "Unknown". Not very helpful.

Personally, I would like Recorder to have an option to bar all records without either Location or Spatial ref. I understand that for some applications a descriptive text location may be useful, but we would never accept records without a decent GR; it makes life too difficult later on.

Rob Large
Wildlife Sites Officer
Wiltshire & Swindon Biological Records Centre

4

Hi Rob

The idea is to allow a record to be digitised with not very good spatial information. This could be "dead on the M27 nr Portsmouth", or it could be as a result of digitising literature or specimen information, where you might know the geographic region but not the spatial reference.

Perhaps there needs to be some configuration regarding validation and checking - i.e. an option to prevent entry of records without spatial reference, and an option to prevent unchecking records without prompting etc.
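As a rough sketch of what such a configuration might look like, here is a Python illustration. Nothing like this exists in Recorder as far as this thread shows; `ValidationOptions`, `require_spatial_ref` and `location_errors` are hypothetical names, and the default keeps the current behaviour of accepting any one location field.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ValidationOptions:
    # Hypothetical setting: when True, a spatial reference is mandatory.
    require_spatial_ref: bool = False


def _filled(value: Optional[str]) -> bool:
    """True if the field holds something other than whitespace."""
    return value is not None and value.strip() != ""


def location_errors(
    location: Optional[str],
    location_name: Optional[str],
    spatial_ref: Optional[str],
    options: ValidationOptions,
) -> List[str]:
    """Return a list of problems with the location fields (empty if none)."""
    errors: List[str] = []
    if not any(_filled(v) for v in (location, location_name, spatial_ref)):
        errors.append("No location information entered.")
    elif options.require_spatial_ref and not _filled(spatial_ref):
        errors.append("A spatial reference is required by the current settings.")
    return errors


# Default settings: a text-only location is accepted.
print(location_errors(None, "dead on the M27 nr Portsmouth", None, ValidationOptions()))
# Strict settings: the same record is rejected.
print(location_errors(None, "dead on the M27 nr Portsmouth", None,
                      ValidationOptions(require_spatial_ref=True)))
```

Keeping the strict check behind an option means sites that digitise literature or specimen records can carry on as now, while a records centre that insists on grid references can switch it on.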

John van Breda
Biodiverse IT

5

That would be more or less what I have in mind, yes, John.

Rob Large
Wildlife Sites Officer
Wiltshire & Swindon Biological Records Centre