1

Topic: ALA/GBIF data processing.

I came across this paper the other day; it may have implications for the NBN at some point.

"In an effort to improve the quality of biodiversity records, the Atlas of Living Australia (ALA) and the Global Biodiversity Information Facility (GBIF) use automated data processing to check individual data items. The records are provided to the ALA and GBIF by museums, herbaria and other biodiversity data sources.

However, an independent analysis of such records reports that ALA and GBIF data processing also leads to data loss and unjustified changes in scientific names"

Mesibov R (2018) An audit of some filtering effects in aggregated occurrence records. ZooKeys 751: 129-146. https://doi.org/10.3897/zookeys.751.24791

2

Re: ALA/GBIF data processing.

Many thanks for letting us know about the paper. It is really useful. I need to go through the paper in detail, but we have seen some of these problems in the Atlas: in how fields are processed, in which value (processed or unprocessed) is available in the downloads, and in processed values being empty when the original record has a value.

Hopefully, thanks to the UKSI, we don't have the problem of unjustified changes in scientific names: almost all records are supplied with the UKSI TVK.

Thanks again, Sophie

3

Re: ALA/GBIF data processing.

When I saw Matt had posted this paper, I had hoped that the UKSI would be helping to avoid the name problem with the NBN Atlas, so it is great to hear that this is the case. It's easy to take the UKSI and TVKs for granted, but the UKSI is such a critical database for managing biodiversity data.

-----------------
Teresa Frost | Wetland Bird Survey National Organiser | BTO
Other hat  | National Forum for Biological Recording Council
(Old hats  | NBN Board, ALERC Board, CBDC, KMBRC)