Topic: Accuracy of data migration from Gateway to Atlas
Currently I am not aware of any information describing the Atlas data model.
Nor am I aware of any information describing the migration of data from the Gateway to the Atlas and the consequences of any differences in the data model.
I have previously speculated on a potential loss of accuracy in location information.
Now I have an indication that there may also be a loss of information in date recording, so I thought I would start a new thread where we can collate similar findings. I would like to see this thread superseded by some accurate documentation.
The date format used across the Gateway, Recorder 6 and Indicia consists of a start date, an end date and a date type. This was documented in the Guide to the NBN Exchange Format.
As far as I can divine, the Atlas uses three fields for day, month and year. Here is a table listing examples of records of all date types and their date values on both Gateway and Atlas.
+--------------------------------------------+-----------------------------------------------------------+
| Gateway                                    | Atlas                                                     |
+-----------+------+------------+------------+--------------------------------------+------+-------+-----+
| Obs. ID   | Type | Start      | End        | UUID                                 | Year | Month | Day |
+-----------+------+------------+------------+--------------------------------------+------+-------+-----+
| 225677268 | D    | 1929-06-19 | 1929-06-19 | 357af461-459c-416b-a232-ae690d11d519 | 1929 | 06    | 19  |
+-----------+------+------------+------------+--------------------------------------+------+-------+-----+
| 482518890 | DD   | 2011-10-02 | 2011-10-05 | 9b04e7f1-6520-4171-930b-246c06a6e465 | 2011 | 10    |     |
+-----------+------+------------+------------+--------------------------------------+------+-------+-----+
| 225732743 | O    | 1960-06-01 | 1960-06-30 | 8306a297-c917-441e-a076-efcac7c90fe5 | 1960 | 06    |     |
+-----------+------+------------+------------+--------------------------------------+------+-------+-----+
| 546763565 | OO   | 2003-06-01 | 2003-08-31 | 7774b804-943f-43f5-81d0-1dca96383c7b | 2003 | 06    |     |
+-----------+------+------------+------------+--------------------------------------+------+-------+-----+
| 225694397 | Y    | 1963-01-01 | 1963-12-31 | 0fd1c74c-3e08-4d0e-9556-860b33bc823a | 1963 | 01    |     |
+-----------+------+------------+------------+--------------------------------------+------+-------+-----+
| 225690837 | YY   | 1736-01-01 | 1848-12-31 | fd89add7-e7e7-4b22-adab-6519b3fb1404 | 1736 | 01    | 01  |
+-----------+------+------------+------------+--------------------------------------+------+-------+-----+
| 225695129 | -Y   |            | 1958-12-31 | 76842205-6e88-47f2-a182-3345b32f4bdd |      |       |     |
+-----------+------+------------+------------+--------------------------------------+------+-------+-----+
| 117364499 | ND   |            |            | 05fcc56a-d4eb-45f9-92af-3965ac220480 |      |       |     |
+-----------+------+------------+------------+--------------------------------------+------+-------+-----+
| 479988466 | U    |            |            | a610d79d-ce92-4b01-96fc-aa51223100a5 |      |       |     |
+-----------+------+------------+------------+--------------------------------------+------+-------+-----+
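I can't be sure whether these examples are typical, but what I see here is that only types D and O (and, I suppose, ND and U, which have no dates to convert) convert accurately.
DD suffers a loss of precision: the range 2011-10-02 to 2011-10-05 is reduced to October 2011. That is to be expected based upon my guess at the data model.
OO, Y and YY suffer an unreasonable increase in precision: June to August 2003 becomes June 2003, the year 1963 becomes January 1963, and the range 1736 to 1848 becomes 1 January 1736.
-Y cannot be represented, so the end date of 1958-12-31 is lost altogether.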
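The comparison above can be checked mechanically. What follows is only a minimal Python sketch, not a description of how the Atlas actually processes dates. It assumes that a blank day means "some time in that month" and a blank month means "some time in that year", which is a guess at the data model. The function names `implied_range` and `compare` are made up for illustration, and the rows are copied from the table.

```python
from calendar import monthrange
from datetime import date

def implied_range(year, month, day):
    """Range implied by year/month/day fields (None = blank). Assumed reading."""
    if year is None:
        return None
    if month is None:
        return date(year, 1, 1), date(year, 12, 31)
    if day is None:
        last = monthrange(year, month)[1]  # number of days in that month
        return date(year, month, 1), date(year, month, last)
    return date(year, month, day), date(year, month, day)

def compare(start, end, ymd):
    """Classify the Atlas fields against the Gateway start/end dates."""
    atlas = implied_range(*ymd)
    if atlas is None:
        return "exact" if start is None and end is None else "lost"
    lo = start or date.min  # an open start is treated as unbounded
    hi = end or date.max    # an open end is treated as unbounded
    a_lo, a_hi = atlas
    if (a_lo, a_hi) == (lo, hi):
        return "exact"
    if a_lo <= lo and hi <= a_hi:
        return "less precise"  # Atlas range is wider than the Gateway range
    if lo <= a_lo and a_hi <= hi:
        return "more precise"  # Atlas range is narrower than the Gateway range
    return "inconsistent"

def parse(text):
    """Turn an ISO date string into a date; blank cells become None."""
    return date.fromisoformat(text) if text else None

# (type, Gateway start, Gateway end, Atlas (year, month, day)) from the table
ROWS = [
    ("D",  "1929-06-19", "1929-06-19", (1929, 6, 19)),
    ("DD", "2011-10-02", "2011-10-05", (2011, 10, None)),
    ("O",  "1960-06-01", "1960-06-30", (1960, 6, None)),
    ("OO", "2003-06-01", "2003-08-31", (2003, 6, None)),
    ("Y",  "1963-01-01", "1963-12-31", (1963, 1, None)),
    ("YY", "1736-01-01", "1848-12-31", (1736, 1, 1)),
    ("-Y", "",           "1958-12-31", (None, None, None)),
    ("ND", "",           "",           (None, None, None)),
    ("U",  "",           "",           (None, None, None)),
]

results = {t: compare(parse(s), parse(e), ymd) for t, s, e, ymd in ROWS}
for date_type, verdict in results.items():
    print(f"{date_type:3} {verdict}")
```

Under that assumed reading, D, O, ND and U come out as exact, DD as less precise, OO, Y and YY as more precise, and -Y as lost.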
The original values from the Gateway are retained in the raw object of the Atlas occurrence, but I cannot see index fields for them, so records cannot be filtered on them.
Jim Bacon.