Murdo, please see our responses inline below:
Can you please explain, however, the order and choice of fields in the download, and the other matters mentioned below?
-So far, the order of fields has not been given much thought. We are focusing our efforts on functionality and content at the moment, and will come back to more minor issues such as field order in due course. You will see below that we have adjusted the field order for OSGR and Licences. For now, we assume that it is simple enough for users to move columns into a useful order for their specific needs after download, though we can of course offer advice on how to do this if required.
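As a rough illustration of moving columns around after download, a minimal Python sketch might look like the following. The function name `reorder_columns` and the column headings in the sample are made up for the example; they are not the actual headings in the download.

```python
import csv
import io


def reorder_columns(csv_text, preferred):
    """Return csv_text with the `preferred` columns first.

    Columns not named in `preferred` follow in their original order.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    original = reader.fieldnames or []
    # Ignore preferred names that do not exist in this file.
    front = [name for name in preferred if name in original]
    rest = [name for name in original if name not in front]

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=front + rest, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


# Hypothetical headings and values, for illustration only.
sample = (
    "Locality,Recorder,Scientific name,OSGR,Date\n"
    "Loch Garten,A. Smith,Sciurus vulgaris,NH9718,2016-05-01\n"
)
reordered = reorder_columns(sample, ["Scientific name", "OSGR", "Date", "Recorder"])
print(reordered)
```

The same approach should work on a real download by reading the file contents into `csv_text` and listing the "what, where, when and who" columns in `preferred`.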
The binomial appears three times.
- The system is set up to deal with names being provided on their own, i.e. without GUIDs. Often these names are in different formats (authorships, abbreviations etc.). GUIDs are obviously a better choice for data exchange. So what we have is (1) the name as provided, (2) the name we’ve matched it to (admittedly redundant when records have a taxon version key) and (3) the species name. The last of these is for when users download the data for a genus, where the species will of course differ between records; and if the record is for a subspecies, then (1) and (2) will be a trinomial while (3) is the binomial, which is hopefully useful for sorting.
The vernacular appears twice.
-This is displaying the provided name versus the matched name, e.g. a record provided as “Red Squirrel” will show “Red Squirrel” as well as the preferred name of the taxon it was matched to, “Eurasian Red Squirrel”.
The lat/long appears twice (4 fields in all, although in the file I have downloaded two are blank).
-We’ve got ‘latitude - original’ and ‘longitude - original’, and ‘latitude - processed’ and ‘longitude - processed’. The ‘original’ fields hold the values as supplied (which may be formatted differently from a decimal format) and the ‘processed’ fields hold those values parsed to decimal values.
The field ‘Species Inventory GUID’ appears to contain what we have previously called the TVK.
-This is on our list of things to edit, and we will be changing this to Taxon Version Key in due course.
I am told for every record that it refers to the UK and Scotland (I might have guessed that from the title ‘Atlas of Living Scotland’ and the knowledge that Scotland is still in the UK). Why?
-This will be useful when downloading from the NBN Atlas, which will cover multiple countries in the UK and British Isles. The user may want to filter out certain countries, and having this information in the download will be very useful for such tasks. We are using a world country layer, which is an ISO country list, and a layer for the countries within the UK.
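To show the kind of filtering we have in mind, here is a minimal Python sketch. The function name `filter_by_country`, the column headings (‘Country’, ‘UK country’) and the sample rows are assumptions made for the example and may not match the headings in the actual download.

```python
import csv
import io


def filter_by_country(csv_text, column, wanted):
    """Return the rows (as dicts) whose value in `column` is in `wanted`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    wanted_set = set(wanted)
    return [row for row in reader if row.get(column) in wanted_set]


# Hypothetical headings and values, for illustration only.
sample = (
    "Scientific name,Country,UK country\n"
    "Sciurus vulgaris,United Kingdom,Scotland\n"
    "Sciurus vulgaris,United Kingdom,England\n"
    "Sciurus vulgaris,United Kingdom,Wales\n"
)
scottish = filter_by_country(sample, "UK country", ["Scotland"])
print(len(scottish), scottish[0]["UK country"])
```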
Despite all that redundancy, the very useful (some would say vitally important) determiner, comment, record type, abundance (but not sex) and other record attributes are absent, despite their lack having been notified months ago.
-This is in our project plan and we will work on getting these attributes added in due course.
Most people wanting to inspect a record want in the first instance the ‘what, where, when and who’. To get that, we need to extract (manually) cols 3, 4, 16, 25, 28, 30, 36. Why was the current field order chosen? I assume that there is a good positive reason, for example, behind the decision to place 19 fields between ‘Locality’ and ‘OSGR’, but it escapes me.
-This is just the default order in which the CSV is downloaded. We will be working on the order as time goes on, as the columns can easily be reordered. We will be asking the steering group to advise on the order. In the interim we have moved some of these columns so that the OSGRs are near the geospatial information and the licence column is the first column.
The last five fields include as Boolean values these three which are at best obscure: ‘Occurrence status assumed to be present’ - what does this mean (assumed not to be a zero abundance/absence record perhaps)?;
-Your interpretation is correct. Occurrence status is the term used in the international standard Darwin Core.
‘Name not in national checklists’ - this would seem to be obvious, but as all the records in my file have this as TRUE, telling me that the species is NOT in a national checklist despite all evidence to the contrary, I have obviously missed something.
-Thank you for spotting this bug; it has now been fixed.
‘Precision / range mismatch’ – what does this mean?
-This shows whether the coordinatePrecision (see Darwin Core) is inconsistent with the record provided, e.g. if you say coordinatePrecision=0.00001 but the coordinates are then given as -3.12, 54.12. See http://rs.tdwg.org/dwc/terms/#coordinatePrecision for the definition. This attribute is flagged if the value is bad, as it should be between 0 and 1. We can appreciate the confusion here, as the coordinatePrecision is being provided but in the wrong format, i.e. 100m. We will discuss this with the developer and fix it in due course.
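To make the two cases concrete, here is a minimal Python sketch of such a check. It is our own illustration, not the code the Atlas actually runs. The function names, the return labels, and the rule that a declared precision finer than the decimal places present in the coordinates counts as a mismatch are all assumptions, as is treating the valid range as greater than 0 and up to and including 1.

```python
from decimal import Decimal, InvalidOperation


def decimal_places(value):
    """Number of digits after the decimal point in a decimal coordinate string."""
    exponent = Decimal(value).as_tuple().exponent
    return max(0, -exponent)


def check_precision(precision, latitude, longitude):
    """Return 'invalid', 'mismatch' or 'ok' for a record's coordinatePrecision.

    All three arguments are strings as they might appear in a data file.
    """
    try:
        declared = Decimal(precision)
    except InvalidOperation:
        return "invalid"  # e.g. "100m" is not a plain number
    # Assumed valid range: 0 < precision <= 1.
    if not declared.is_finite() or not (Decimal(0) < declared <= Decimal(1)):
        return "invalid"
    # Finest step that the supplied coordinates can actually express.
    places = max(decimal_places(latitude), decimal_places(longitude))
    implied = Decimal(10) ** -places
    # Assumed rule: claiming finer precision than the coordinates show is a mismatch.
    if declared < implied:
        return "mismatch"
    return "ok"


print(check_precision("0.00001", "54.12", "-3.12"))
print(check_precision("100m", "54.12", "-3.12"))
print(check_precision("0.01", "54.12", "-3.12"))
```

With these assumptions, the first call corresponds to the example above (a precision of 0.00001 declared against coordinates with only two decimal places), and the second to a precision supplied as a distance such as 100m.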