1

Re: Importing using Record ID and Site ID

There is a major problem with this.
As a scheme organiser I manage different kinds of incoming datasets.
I have datasets from various individuals, groups, field surveys, museum collections, published records and I have my own ongoing collection.
If I use the "Site ID + Unique index + Record ID" (8:2:6) system to manage the imported records from my personal collection then the Taxon_occurrence_key rather usefully has the same number (nearly) as the one on all my specimen labels. This tie-in between collections and data published on the NBN Gateway will prove invaluable in the future.
However, I cannot import any other dataset for fear (justified, as I've discovered) that it will use up the Taxon_occurrence_keys I need for my specimens.
In my personal databases I had allowed for this eventuality by assigning the "Unique index" part of the above formula to all these various different sources but the import system has taken these two digits for other purposes (see next posting for an idea of how this works).
As a consequence we now have a restricted set of choices:
1. A dataset limit of 999,999 records (most people aren't clever enough to use letters). N.B. some have exceeded this already.
2. Using your R6 solely for one sequentially numbered collection.
3. Resolving 2. by requesting Site IDs for each of the datasets you manage (with the consequence that you cannot now edit the records, as they don't belong to your Site ID).
4. A further problem: if one were to attempt any form of resolution, records already shared with others would receive new Taxon_occurrence_keys and thus become duplicates in some other R6 user's system.

The simplest fix would seem to be an easy way to switch between all the Site IDs one ends up managing under option 3.

2

Re: Importing using Record ID and Site ID

My explanation of Taxon_occurrence_key constructions
Recorder import uses the following construction:

"Site ID + Unique index + Record ID" (8:2:6)

giving us a key

ADS00001nn999999

in fact, without the "nn" part what we have is a key which is unique right down to the level of Sample and the "nn" now enumerates all the individual taxa within that Sample thus

ADS0000101000001 - taxon 1 at Sample 1
ADS0000102000001 - taxon 2 at Sample 1
ADS0000103000001 - taxon 3 at Sample 1
ADS0000104000001 - taxon 4 at Sample 1
ADS0000101000002 - taxon 1 at Sample 2

Whereas the managing of datasets by a recording scheme as discussed above requires

ADS00001AA000001 - taxon 1 of dataset AA
ADS00001AA000002 - taxon 2 of dataset AA
ADS00001AA000003 - taxon 3 of dataset AA
ADS00001AA000004 - taxon 4 of dataset AA
ADS00001BB000001 - taxon 1 of dataset BB

Not all datasets from contributors to recording schemes have this kind of format, but many do require sequential numbering of their collections, and their owners may also try to use Recorder in the way outlined above.

3

Re: Importing using Record ID and Site ID

I don't quite understand what you're doing here Darwyn (probably just me being thick).

When you say "If I use the "Site ID + Unique index + Record ID" (8:2:6) system to manage the imported records from my personal collection", do you mean you're adding a code that follows this format to data held outside of Recorder?

Are you then trying to import it into Recorder?

I can't quite work out what "the Taxon_occurrence_key rather usefully has the same number (nearly) as the one on all my specimen labels" means either. Could you possibly provide a step-by-step example of exactly what you're doing or hope to achieve? I'm just not getting it. :)

Charles Roper
Digital Development Manager | Field Studies Council
http://www.field-studies-council.org | https://twitter.com/charlesroper | https://twitter.com/fsc_digital

4

Re: Importing using Record ID and Site ID

Go back several years to descriptions of the Globally Unique Identifier principle (which you'll find in Charles Copp's description of the data model and also in the Recorder Help files) and you'll find the following simple description:

Each copy of Recorder is given a unique identifier consisting of an 8-character string of upper case letters (A..Z) and digits (0..9) - i.e. 36 possible characters. Each table also has a running identifier similarly consisting of an 8-character string of upper case letters and digits.

Acting upon this, many of us adopted this principle and assigned an 8 character unique identifier to our own records in what the new Recorder 6 Help file terms a "source system" and have been doing so since the last century. It was such a good principle.
At the time that the MapMate GUIs were being deciphered so as to allow exchange with Recorder there was a promise that the facility to import using these 8 character unique identifiers would be implemented in the import system. Seven years of effort were put into ensuring that source systems all had 8 character unique identifiers. But disappointingly this never seemed to get implemented.
Until now, with R6.10 and Sally Rankin's notes on the "Import Wizard - data format" page on the new Recorder Help file (the one on the CD):

Record ID: An identifier for the record supplied by the source system. This allows an import to run more than once without creating duplicate records.
and
Site ID: An 8-character NBN globally unique identifier for the system from which the records originated.

Now the opportunity arises to take full advantage of the effort I've put into studying and implementing the data model and I can import all my Edinburgh Museum records (identified EM000001 to EM000965), Oldham Museum (OM000001 to OM003561), North West Hoverfly Recording group (HR000001 to HR015671), Invertebrate Site Register (IS000001 to IS000651), personal records (FR000001 to FR019198) into my copy of Recorder 6.
Seemingly not. Somewhere along the line the documented Site ID:running identifier::8:8 has been changed to Site ID:Unique index:running identifier::8:2:6. I missed the documentation for this substantial change in the data model principles. Was it ever documented?
You can see what happens next:
Working from my most recent personal records I successfully import using just the 019198 part of my identifier but as soon as I try to import using the lower numbers I find that they are already occupied by records I've already imported using systems prior to this utility and they try to overwrite entirely different records.
You can easily check this for yourself. Find an existing taxon occurrence of the form ADS0000100nnnnnn, then write a one-line spreadsheet for import with a Site ID of ADS00001 and a Record ID of nnnnnn. You will find that the new record attempts to overwrite the old one (you can stop it at the last stage, fortunately).
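The clash can be pictured with a small Python sketch. It is only a model of the behaviour described, not Recorder's import code: the dictionary stands in for the taxon occurrence table, `import_record` is an invented name, and the 8-character Site ID ADS00001 is taken from the key examples earlier in the thread.

```python
# Stand-in for the taxon occurrence table: key -> record.
existing = {"ADS0000100000123": "record imported earlier by another route"}

def import_record(store, site_id, record_id, payload, overwrite=False):
    """Build the 8:2:6 key for a top-level record and report any clash.

    Returns (key, written). Nothing is overwritten unless overwrite=True.
    """
    key = f"{site_id}00{str(record_id).zfill(6)}"
    if key in store and not overwrite:
        return key, False  # the key is already occupied by another record
    store[key] = payload
    return key, True

key, written = import_record(existing, "ADS00001", 123, "a different record")
```

Here `written` comes back false because Record ID 123 under Site ID ADS00001 maps onto a key that is already in use.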
So in answer to your questions:
1. No, I added a code that followed the Site ID:running identifier::8:8, as published - and have been doing so for years as this was how it was explained in the model.
2. Yes
3. Recorder isn't a curatorial system. One of the primary needs of collecting entomologists is for data labels. One reason for the non-take-up of Recorder is that it doesn't provide the facilities to print a 13mm by 10mm data label or cater for specimens which are without identifications. Clearly these are conceptually different applications (and a huge topic which I cannot go into here). Therefore many use spreadsheets or Access for their curatorial systems to allow them to do these kinds of things. Specifically I have on my data label "FR019198" whilst the Recorder Taxon_occurrence_key is "ADS0000100019198" - near enough to be of great help in rapidly (GoTo Key) locating a record from the specimen, invaluable in relating specimen to record (notwithstanding the Specimen module) and essential in ensuring that updates in the curatorial system get updated in Recorder 6 when I revise identifications in the curatorial system - the point that Sally makes in the Help file.
Not a straightforward topic I agree, Charles, but the ramifications of the ?undocumented change in the data model are considerable. We need a fix for this one way or another.

5

Re: Importing using Record ID and Site ID

I'm not quite sure I follow Darwyn. Recorder has never changed the principles for generating primary keys for records. Here's how it works.

1) Each installation gets assigned an 8 character site ID with the license. There is no rule as to how these are generated, except that they must be unique. To ensure that they remain unique they are all provided by JNCC.

2) There is a table in the database called LAST_KEY, which is initially empty but will come to contain a row for each database table.

3) When a record is inserted into any given table, the system checks for a LAST_KEY record for that table. If none is found then the ID of '00000000' is selected, otherwise the value in the table is incremented by one and used as the ID. When incrementing this ID, each digit first goes 0-9, then A-Z, e.g. 00000000-00000009, then 0000000A-0000000Z, then 00000010-00000019 etc.

4) The new ID is then written back to the LAST_KEY table.

5) All 8 digits of this ID are used, giving 36^8 (about 2.8 trillion) possible combinations.

6) The ID is concatenated with the Site ID to make the primary key value for the new record. Therefore we have a key guaranteed to be globally unique which also tells us the database that the record originated from.

7) Recorder has not been designed to allow the user to generate their own primary keys for records, unless the user follows these rules. Therefore it will attempt to overwrite any manually generated taxon occurrence records with your own ID system.

8) Really, the NBN Key is an internal unique identifier and should not be expected to tie into specimen registration numbers; these should be stored elsewhere. As you rightly point out, Recorder is not really a collections management system, but if you are interested, the Collections Module is a very powerful add-in developed with the Natural History Museum in Luxembourg which does cover specimen numbering and labelling properly.
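The numbering steps in the list above can be sketched in a few lines of Python. This is an illustration only, assuming plain base-36 counting over 0-9 then A-Z (so 0000000Z rolls over to 00000010); `next_running_id` and `make_key` are invented names and are not part of Recorder.

```python
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"  # 0-9 then A-Z, 36 symbols

def next_running_id(last_id=None, width=8):
    """Return the ID that follows last_id (None means no LAST_KEY row yet)."""
    if last_id is None:
        return "0" * width
    value = int(last_id, 36) + 1  # int() reads 0-9 and A-Z as base-36 digits
    if value >= 36 ** width:
        raise OverflowError("running ID space exhausted")
    digits = []
    for _ in range(width):
        value, remainder = divmod(value, 36)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))

def make_key(site_id, running_id):
    """Concatenate an 8-character Site ID and an 8-character running ID."""
    if len(site_id) != 8 or len(running_id) != 8:
        raise ValueError("both parts must be 8 characters long")
    return site_id + running_id
```

In this sketch the value returned by `next_running_id` would be written back to LAST_KEY before the next insert, so each table keeps its own counter.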

Best Wishes

John van Breda
Biodiverse IT

6

Re: Importing using Record ID and Site ID

Thanks, John. You've expanded on an element of my notes and that would have been all that I needed to know until I read Sally's notes on Importing in the new Recorder Help.
It does actually work as she says and refuses Record IDs of more than 6 digits. It is there, so presumably it was written into the Import routines at some point with the intention of tying in with sequential numbering in a "source system", and I have investigated it for the reasons stated above.
I've contacted her and asked for clarification.
Charles has spoken about the Luxembourg add-in, sounds very interesting.
About a tenner and in time for Christmas would be nice.

7

Re: Importing using Record ID and Site ID

When the Import Wizard was rewritten for Recorder 6 the ability to import records with their own Site ID and Record ID was introduced. When I revised the help earlier this year I updated the column types that the wizard would import with information from the Technical System Design document produced by Dorset Software. In the version I have, Record ID is covered in 1.18.2.11 (page 55) and Site ID in 1.18.2.14. For details see Help in Recorder 6 – Tasks on the Contents tab – Exchanging data – Import data – Import Wizard – Data format, and scroll down to the table of supported columns.

In the example I used of importing a record with Record ID 77 and Site ID TEST1234 the taxon occurrence created would have a key of TEST123400000077. If this record had 3 sets of measurement data the keys of the taxon occurrence data entries created would be TEST123400000077, TEST123401000077 and TEST123402000077, i.e. the 2 character unique index is being reserved to enable unique keys to be generated for tables that may need multiple entries for the taxon occurrence. This principle will ensure that the same keys will be created no matter which copy of Recorder 6 the data is imported into, thus avoiding duplication when data is exchanged.
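As a rough sketch of the 8:2:6 construction in that example (not Recorder's own code), the following Python builds the keys from their three parts. The function name `import_key` is invented, and it is an assumption that the unique index is a zero-padded decimal counter and that the Record ID is left-padded with zeros, as the TEST1234 keys suggest.

```python
def import_key(site_id, record_id, index=0):
    """Build a 16-character key: Site ID (8) + unique index (2) + Record ID (6)."""
    if len(site_id) != 8:
        raise ValueError("Site ID must be exactly 8 characters")
    if not 0 <= index <= 99:
        raise ValueError("unique index must fit in 2 digits")
    record = str(record_id).zfill(6)  # assumed: pad on the left with zeros
    if len(record) > 6:
        raise ValueError("Record ID must be at most 6 characters")
    return f"{site_id}{index:02d}{record}"

# Record ID 77 from Site ID TEST1234, with three sets of measurement data.
keys = [import_key("TEST1234", 77, i) for i in range(3)]
```

Because the key depends only on the Site ID, the index and the Record ID, the same input gives the same keys in any copy of Recorder 6, which is the point made above about avoiding duplicates on exchange.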

I have no recollection of seeing this documented in anything that is made available to users prior to my upgrades to the Help but Darwyn’s predicament illustrates the need for better documentation. This facility was in Recorder 6 when it first went on general release in September 2005.

As I see it, for Darwyn to use this facility to import records from the datasets he mentions in his post of 17/12/07 at 19.15 above, he would need a separate Site ID for each data set and then to drop the first 2 characters from each record identifier, e.g. take the EM out of EM000001 to EM000965.
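A minimal Python sketch of that preparation step follows. The Site IDs in the table are made-up placeholders, not real allocations (real ones must come from JNCC), and `split_identifier` is an invented helper name.

```python
# Placeholder Site IDs for illustration only; real ones are issued by JNCC.
SITE_ID_FOR_PREFIX = {
    "EM": "SITEEM01",  # Edinburgh Museum records
    "OM": "SITEOM01",  # Oldham Museum records
}

def split_identifier(identifier):
    """Split e.g. 'EM000001' into (Site ID for 'EM', Record ID '000001')."""
    prefix, record_id = identifier[:2], identifier[2:]
    if len(record_id) != 6 or not record_id.isalnum():
        raise ValueError(f"expected 2 prefix characters plus 6 more: {identifier!r}")
    if prefix not in SITE_ID_FOR_PREFIX:
        raise KeyError(f"no Site ID assigned for prefix {prefix!r}")
    return SITE_ID_FOR_PREFIX[prefix], record_id
```

The two returned values would then go into the Site ID and Record ID columns of the import spreadsheet.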

Please note that this facility should NOT be used unless you check with JNCC that the Site ID you are planning to use is unique.

Sally Rankin, JNCC Recorder Approved Expert
E-mail: s.rankin@btinternet.com
Telephone: 01491 578633
Mobile: 07941 207687

8

Re: Importing using Record ID and Site ID

Thanks, Sally.
For those of you who are taking notes.
Turns out my interpretation of how Recorder now allocates the various parts of the Globally Unique Identifier wasn't correct in my posting of 2007-12-17 09:19:09. My third paragraph & table, to fit in with what Sally says above, should have read:

in fact, without the "nn" part, what we have is a key which is unique right down to the level of Taxon_occurrence and the "nn" now enumerates all the various Measurements within that Taxon_occurrence thus:

ADS0000101000001 - Measurement 1 of Taxon_occurrence 1
ADS0000102000001 - Measurement 2 of Taxon_occurrence 1
ADS0000103000001 - Measurement 3 of Taxon_occurrence 1
ADS0000104000001 - Measurement 4 of Taxon_occurrence 1
ADS0000101000002 - Measurement 1 of Taxon_occurrence 2
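
For anyone wanting to check keys like these by eye, a small Python sketch that splits a 16-character key into its 8:2:6 parts may help. `KeyParts` and `parse_key` are invented names; the split positions are simply those given in this thread.

```python
from typing import NamedTuple

class KeyParts(NamedTuple):
    site_id: str    # first 8 characters
    index: str      # next 2 characters (unique index)
    record_id: str  # last 6 characters

def parse_key(key):
    """Split a 16-character key into Site ID, unique index and Record ID."""
    if len(key) != 16:
        raise ValueError("expected a 16-character key")
    return KeyParts(key[:8], key[8:10], key[10:])

parts = parse_key("ADS0000102000001")
```

Here `parts.record_id` is the piece that would correspond to the number on a specimen label.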

I'm now investigating solution 3. from my 2007-12-17 09:01:17 posting.
I wonder how JNCC are going to react to my request for a whole bunch of Site IDs - watch this space.

9

Re: Importing using Record ID and Site ID

Hi

Took a while to get my head around (still not sure I am quite there!) what Darwyn is trying to do.

Although there is the facility in Recorder 6 to use your own Site ID and Record IDs to import records, I would advise users against using this extensively. This functionality needs more rigorous testing before I would recommend using it regularly, and managing unique keys that have 'meaning' (i.e. are used to relate to an external database) can potentially get quite messy. We have provided Darwyn with a set of Site IDs to use for these purposes only but have advised him to proceed with caution! However, he's said that the database he is using this approach with isn't 'mission critical' and that if he comes across anything of concern he will post it here. Generally I would recommend, whenever possible, using other fields to link to external datasets.

Problem with creating unique keys using MapMate keys

Unfortunately this also links in to some issues we came across very recently with importing MapMate data, namely where the unique keys created are not being padded out in Recorder 6 in the same way as they were in Recorder 2002 (i.e. they are now being padded out with zeros rather than Es). We have therefore raised a CCN for the next version that will address the following:

- Pad out unique keys with Es, not zeros
- Keys created are in uppercase
- Survey_event, sample, survey_event_recorder etc. keys should be created using the key of the first (lowest) record from a unique batch rather than the key of the last record
- With multiple recorder keys the "1" should be added in the first character position of the Record ID instead of the third

If you have any queries regarding this please post them here or email me.

Many thanks,

Sarah Shaw
Biodiversity Information Assistant
JNCC