1

Topic: Duplicate Data

Sorry to post again!

I know that there is an XML report to identify duplicates in the database, but is there a way of batch deleting these duplicates, or does each one have to be found individually?

Many Thanks,

Sophie

2

Re: Duplicate Data

Hi Mike,

Thanks for sending that over, but it doesn't seem to want to open for me?

S

3

Re: Duplicate Data

Hi

The batch update is at www.Lfield.co.uk/downloads/JNCCDel7Record.zip. It is a while since I looked at it, so I will need to check exactly how it works and what it is doing.

Mike

Mike Weideli

4

Re: Duplicate Data

Hi Mike,

I tried opening it, but Recorder is saying that it cannot be parsed. I realise you are double checking it at the moment, though :)

Thanks again,

S

5

Re: Duplicate Data

You will need to unzip the file and put it in the Batch Update folder. This works for me, but if this is what you did and it hasn't worked, let me know. The Batch Update (found under Delete in the menu list) needs a list of Taxon_Occurrence keys to delete. These can come from a CSV file generated from the output of the Duplicate Report; however, this shows all the duplicated keys, so you would need to manually select the records you don't want. We haven't provided a standard way of deleting duplicates, because there is no safe automated way of deciding which records to keep or delete. However, if you have some rules which can be applied, then I can probably automate the process or find an easier way of doing it.
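For what it's worth, the manual selection step could be scripted roughly as below. This is only a sketch in Python, not part of Recorder: it assumes you have exported the Duplicate Report to CSV, added your own column marking the rows you have checked and want removed, and now want just those keys pulled out. The column names (taxon_occurrence_key, delete) and the sample keys are made up for illustration, so adjust them to whatever your export actually contains.

```python
import csv
import io

def keys_marked_for_deletion(report_file, key_column="taxon_occurrence_key",
                             flag_column="delete"):
    """Collect keys from rows a reviewer has explicitly flagged for deletion.

    Column names are hypothetical; rename them to match your exported CSV.
    """
    keys = []
    for row in csv.DictReader(report_file):
        # A missing or blank flag means "keep", so nothing is deleted by default.
        flag = (row.get(flag_column) or "").strip().lower()
        if flag in {"y", "yes"}:
            keys.append(row[key_column].strip())
    return keys

# Made-up sample rows standing in for a reviewed Duplicate Report export.
sample = io.StringIO(
    "taxon_occurrence_key,delete\n"
    "KEY0000000000001,\n"
    "KEY0000000000002,y\n"
    "KEY0000000000003,Yes\n"
)
selected = keys_marked_for_deletion(sample)
```

Only rows someone has deliberately flagged end up in the list, which keeps the decision about what to keep with the person checking the records.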

Mike Weideli

6

Re: Duplicate Data

Hi Mike,

Sorry, my mistake, it is under batch updates and all seems to be in order. We'll definitely have to be very careful about sending the CSV file through, as Recorder seems to have picked up many 'duplicates' that contain different information outside of the fields specified by the report. Thank you very much though, as this will make it much easier!

This system probably does work best for us, as it forces us to scrutinize our records more closely. You've been a great help, thank you :)

7

Re: Duplicate Data

Del7 will delete the TOCC keys listed and all the data below them (e.g. Determinations and Abundances). It will then delete anything above them in the hierarchy where there are no remaining records. For example, it will delete all the samples which originally contained a listed TOCC key but no longer have any occurrences. Please take care with this: take a backup before running the delete and check the results carefully, as there is no way back once you have started entering new data.
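To get a feel for which samples would disappear before running anything, the clean-up rule described above can be modelled on a toy data set. This is a Python illustration only, not how Del7 is implemented, and it does not touch the database: the function name, the sample keys and the occurrence keys are all invented, and it only covers the sample level of the hierarchy.

```python
def preview_cascade(samples, keys_to_delete):
    """Model the effect of deleting occurrence keys on their parent samples.

    samples: dict mapping a sample key to the set of occurrence keys it holds.
    Returns (remaining_samples, removed_sample_keys).
    """
    doomed = set(keys_to_delete)
    remaining = {}
    removed = []
    for sample_key, occurrences in samples.items():
        left = occurrences - doomed
        touched = bool(occurrences & doomed)
        if touched and not left:
            # The sample held a deleted key and is now empty, so it goes too.
            removed.append(sample_key)
        else:
            remaining[sample_key] = left
    return remaining, removed

# Invented data: S1 keeps one occurrence, S2 loses its only one.
toy_samples = {"S1": {"A", "B"}, "S2": {"C"}, "S3": set()}
remaining, removed = preview_cascade(toy_samples, ["A", "C"])
```

In this toy case S2 is removed because its only occurrence was deleted, S1 survives with one occurrence left, and S3 is left alone because it never contained a listed key.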

Mike Weideli