1

Re: Dodgy sort orders?

I recently noticed a few of our bird records appearing out-of-order (i.e., not in the correct taxonomic order) in reports. On closer inspection, I found that many species have a SORT_ORDER of 10600000000, which would indicate that these species don't have a SORT_CODE in the TAXON_LIST_ITEM table. However, it seems that they do have SORT_CODE, but for some reason, they're not being correctly generated when building the INDEX_TAXON_NAME.

Try running the following SQL to see what I mean:

USE NBNData; 
SELECT
  ITN.TAXON_LIST_ITEM_KEY,
  ITN.COMMON_NAME,
  ITN.PREFERRED_NAME,
  ITN.SORT_ORDER,
  TXG.SORT_ORDER,
  TLI.SORT_CODE
FROM
  INDEX_TAXON_NAME ITN
INNER JOIN
  TAXON_LIST_ITEM TLI ON
  ITN.TAXON_LIST_ITEM_KEY = TLI.TAXON_LIST_ITEM_KEY
INNER JOIN
  TAXON_VERSION TXV ON
  TLI.TAXON_VERSION_KEY = TXV.TAXON_VERSION_KEY
INNER JOIN
  TAXON_GROUP TXG ON
  TXV.OUTPUT_GROUP_KEY = TXG.TAXON_GROUP_KEY
WHERE
  TXG.TAXON_GROUP_NAME = 'bird' AND
  ITN.SORT_ORDER = '10600000000'

Here's the first few rows I get back (copy and paste into a text file, save it as a .CSV then open in Excel for optimal viewing):

TAXON_LIST_ITEM_KEY,COMMON_NAME,PREFERRED_NAME,SORT_ORDER,SORT_ORDER,SORT_CODE
NBNSYS0000001839,Aquila heliaca,Aquila heliaca,10600000000,106,85
NBNSYS0000002172,Aquila heliaca,Aquila heliaca,10600000000,106,62
NBNSYS0000002012,Threskiornis aethiopicus subsp. aethiopicus,Threskiornis aethiopicus subsp. aethiopicus,10600000000,106,118
NBNSYS0000002232,Columba palumbus subsp. azorica,Columba palumbus subsp. azorica,10600000000,106,122
NBNSYS0000001967,Gavia immer subsp. immer,Gavia immer subsp. immer,10600000000,106,73
Charles Roper
Digital Development Manager | Field Studies Council
http://www.field-studies-council.org | https://twitter.com/charlesroper | https://twitter.com/fsc_digital

2

Re: Dodgy sort orders?

There are quite a few other taxa that also have this problem. I assume it's something to do with the nameserver, no?

Test using this query:

USE NBNData; 
SELECT
  ITN.TAXON_LIST_ITEM_KEY,
  ITN.COMMON_NAME,
  ITN.PREFERRED_NAME,
  ITN.SORT_ORDER,
  TXG.SORT_ORDER,
  TLI.SORT_CODE
FROM
  INDEX_TAXON_NAME ITN
INNER JOIN
  TAXON_LIST_ITEM TLI ON
  ITN.TAXON_LIST_ITEM_KEY = TLI.TAXON_LIST_ITEM_KEY
INNER JOIN
  TAXON_VERSION TXV ON
  TLI.TAXON_VERSION_KEY = TXV.TAXON_VERSION_KEY
INNER JOIN
  TAXON_GROUP TXG ON
  TXV.OUTPUT_GROUP_KEY = TXG.TAXON_GROUP_KEY
WHERE
  RIGHT(ITN.SORT_ORDER,8) = '00000000' AND
  TLI.SORT_CODE IS NOT NULL
Charles Roper
Digital Development Manager | Field Studies Council
http://www.field-studies-council.org | https://twitter.com/charlesroper | https://twitter.com/fsc_digital

3

Re: Dodgy sort orders?

Hi Charles

Thanks for raising this - I'll look into it and get back to you.

Best wishes,

Sarah

Sarah Shaw
Biodiversity Information Assistant
JNCC

Sarah Shaw
Biodiversity Information Assistant
JNCC

4 (edited by Sarah Shaw 12-07-2007 12:32:44)

Re: Dodgy sort orders?

Hi Charles

Needs further investigation, but the key thing to note is that according to the TSD when building the Index_Taxon_Name table, the sort_order is derived from:

Sort Order from taxon_group table

&

the Sort_code from the taxon_list_item table using the recommended_taxon_list_item_key from the nameserver.

So in the case of TLIk: NBNSYS0000002090 'Hirundo atrocaerulea'

- the sort order from the taxon_group table is '106' (so far so good)

- the recommended_taxon_list_item_key from the Nameserver table is 'NHMSYS000053396'

- the sort_code for 'NHMSYS000053396' in the Taxon_list_item table is 'null' (not so good)

hence this particular species ends up with a sort_order of '10600000000' in the Index_Taxon_Name table.

Try running the following query instead:

USE NBNDATA;
SELECT
  ITN.TAXON_LIST_ITEM_KEY,
  ITN.COMMON_NAME,
  ITN.PREFERRED_NAME,
  ITN.SORT_ORDER,
  TXG.SORT_ORDER,
  TLI2.SORT_CODE
FROM
  INDEX_TAXON_NAME ITN
INNER JOIN
  TAXON_LIST_ITEM TLI ON
  ITN.TAXON_LIST_ITEM_KEY = TLI.TAXON_LIST_ITEM_KEY
INNER JOIN
  TAXON_VERSION TXV ON
  TLI.TAXON_VERSION_KEY = TXV.TAXON_VERSION_KEY
INNER JOIN
  TAXON_GROUP TXG ON
  TXV.OUTPUT_GROUP_KEY = TXG.TAXON_GROUP_KEY
INNER JOIN
    NAMESERVER N ON
    TXV.TAXON_VERSION_KEY = N.INPUT_TAXON_VERSION_KEY
INNER JOIN
    TAXON_LIST_ITEM TLI2 ON
    N.RECOMMENDED_TAXON_LIST_ITEM_KEY = TLI2.TAXON_LIST_ITEM_KEY
WHERE
  TXG.TAXON_GROUP_NAME = 'bird' AND
  ITN.SORT_ORDER = '10600000000'

Hope that makes sense.

It's my first delve into this particular area and i'm not sure why these taxon_list_item_keys don't have sort_orders.

John from the NHM may be able to shed some light on this?

Additional Note: I ran the above query in a 6.9.3 database. However, when I ran a query to find all the rows in Taxon_List_Item table with null sort codes (that are also recommended_taxon_list_item_keys in the Nameserver) in the most recent copy of the dictionary supplied by the NHM there are only 14 entries that potentially cause problems. However, I'll double check this in an updated copy of the Recorder 6 dictionary and also email John).

Kind regards,

Sarah

Sarah Shaw
Biodiversity Information Assistant
JNCC

Sarah Shaw
Biodiversity Information Assistant
JNCC

5

Re: Dodgy sort orders?

Thanks Sarah, that makes sense.

Charles Roper
Digital Development Manager | Field Studies Council
http://www.field-studies-council.org | https://twitter.com/charlesroper | https://twitter.com/fsc_digital