26

Re: Version 6.13 released

Ian & Mike

Would the increase in processing time not be exponential? If it takes 20 mins to check Ian's ~3000 records against each other, doubling the number of records might increase the processing time by a factor of 4 (3000*3000/2 = 4.5m comparisons, 6000*6000/2 = 18m), which would give over 200 hours for his complete dataset.
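
The arithmetic can be checked with a short sketch. It assumes every record is compared once against every other record, which nothing in this thread confirms for report H1. Under that assumption the growth is quadratic rather than strictly exponential: doubling the record count roughly quadruples the work. The function name is purely illustrative.

```python
def pairwise_comparisons(n: int) -> int:
    """Number of distinct pairs among n records: n * (n - 1) / 2."""
    return n * (n - 1) // 2


# Assumed record counts taken from the figures quoted in the post
for n in (3000, 6000):
    print(n, pairwise_comparisons(n))

# Ratio of work when the record count doubles (tends to 4 for large n)
ratio = pairwise_comparisons(6000) / pairwise_comparisons(3000)
print(round(ratio, 2))
```

The exact pair count uses n*(n-1)/2, so it comes out slightly below the n*n/2 approximation in the post, but the factor of about 4 is the same.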

If the query needs to be run on subdivisions of the dataset, can it be run against locations?

Gordon Barker
Biological Survey Data Manager
National Trust

27

Re: Version 6.13 released

Ian

I have put a revised version of report H1 (H1a) on my web site at www.Lfield.co.uk/downloads/JNCC_H1a_Duplicates.zip
If this is downloaded and unzipped into the report folder it will give a new report H1a. This works much faster than the previous version and has the Survey Event date and the sample date in the output. Clicking on a specific Taxon Occurrence or Sample Key in the report will take you directly to the record. You can also use the goto key addin to get to the record. It produces results on 360,000 records in about 10 minutes, but the time will also vary depending on the number of duplicates found. Duplicates should appear together in date order. Please give it a try and let me know if you encounter any problems.

Mike Weideli

28

Re: Version 6.13 released

Hi Mike

Thanks for your reply. I shall see if I can find the 'goto key addin' and try it out. I was trying the other duplicate report last night, which seems to cause the computer to behave in a different way - the page file usage was much higher than with report H1, but the CPU usage was much lower. This second report failed as Drive C: reached the capacity of the partition. I had to delete some less-than-necessary programs to get some space...

I can well imagine that a check of 360,000 records would take ages. Presumably the time for the report to run increases exponentially as the number of records increases? I look forward to hearing about any new version you produce.

All the best, Ian

29

Re: Version 6.13 released

Ian

I have put two replacement reports on my website. These can be downloaded at www.lfield.co.uk/downloads/JNCC_Duplicates.zip. The replacement reports work much faster. The replacement for H1 (H1a) runs on 360,000 records in about 10 minutes, with 5000 duplicates. It may take a bit longer if there are more duplicates. The replacement for H2 (H2a) has more work to do but still runs in 20 minutes or so. From the report you can get to the records by clicking on either a Taxon_Occurrence_key or a Sample_Key. This will take you directly to the record. I have also added the Survey Event date, which will make finding the records in the Observation hierarchy easier. The goto key addin can also be used.

Unzip the download into the reports folder of Recorder 6 and you should get two extra reports, H1a and H2a. Let me know if you have any problems with these. If they are OK I will make them more widely available.

Mike Weideli

30

Re: Version 6.13 released

Hi Mike

I have had a look round in R6.13 and in the Help and can't find a 'goto key' anything. Where is it, please?

Presumably the search to compare records, if done for all records, must grow exponentially as more records are searched. However, presumably not all records need to be compared with each other, as the search starts by looking for:

Same place, same date,

then looking for same observer, then same species, then... to whatever detail you search for.
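
The staged idea above can be sketched by grouping records on a key built from the chosen fields, so that only records sharing a key are ever looked at together. This is a minimal illustration and not how reports H1 or H1a actually work; the records and field names (`location`, `date`, `observer`, `species`) are hypothetical and are not Recorder 6 column names.

```python
from collections import defaultdict

# Hypothetical sample records; field names are illustrative only
records = [
    {"key": "A1", "location": "Pond", "date": "2010-05-01", "observer": "obs1", "species": "Mallard"},
    {"key": "A2", "location": "Pond", "date": "2010-05-01", "observer": "obs1", "species": "Mallard"},
    {"key": "A3", "location": "Pond", "date": "2010-05-01", "observer": "obs1", "species": "Coot"},
    {"key": "A4", "location": "Wood", "date": "2010-05-02", "observer": "obs2", "species": "Mallard"},
]


def find_duplicates(records, fields):
    """Group records by the chosen fields; return groups with more than one member."""
    groups = defaultdict(list)
    for rec in records:
        groups[tuple(rec[f] for f in fields)].append(rec["key"])
    return [keys for keys in groups.values() if len(keys) > 1]


# Coarse match: same place and same date
print(find_duplicates(records, ["location", "date"]))
# Finer match: also same observer and same species
print(find_duplicates(records, ["location", "date", "observer", "species"]))
```

Each record is visited once per pass, so the work grows roughly in line with the number of records rather than with the number of pairs.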

Including Survey Event date in the report would be very helpful and I look forward to hearing about any developments.

Cheers, Ian

31

Re: Version 6.13 released

The GoToKey add-in is on the installation CDs, but I would recommend only using the version on the v6.10 or v6.13 CDs, as a number of the add-ins were revised for v6.10. I thought JNCC would upload them for users who bought earlier versions of Recorder 6, but I don't recollect seeing anything to indicate that this had been done. I have therefore uploaded the GoToKey add-in from the v6.13 installation CD to http://forums.nbn.org.uk/uploaded/GoToKey.zip. The ReadMe.txt file in the zip file tells you how to install and use it.

Sally Rankin, JNCC Recorder Approved Expert
E-mail: s.rankin@btinternet.com
Telephone: 01491 578633
Mobile: 07941 207687

32

Re: Version 6.13 released

I have added a page now for the addins. See:
http://forums.nbn.org.uk/viewtopic.php?pid=3228
Steve

33

Re: Version 6.13 released

Hi Mike, Sally and Steve

Many thanks for your postings. I now have the GoTo addin added and have used it - so thanks for that. I also added the Phenology add-in, which looks very interesting. I couldn't find a way of entering a start date and end date for the search, e.g. to see the pattern of records within one year and compare it with another - is that possible? Currently, it looks as if the graph is based on all records that are held, regardless of year.

Duplicate report 1a is, indeed, much faster, but it identified 12,400+ records out of a database of c. 84,000 records! I was thinking that I had not been that careless on importing data, but then a thought struck me. 'Pair' is one of the allowable 'Sex/stage' entries for import. In the early days I found the data supplied by observers was ambiguous (e.g. someone would make a comment about a pair, but put 2 in the abundance column). To avoid this problem I have asked all observers to enter males and females on separate lines. So, within the parameters of duplicate report 1a, every time an observer has seen and recorded males and females separately, those records come up as possible duplicates. I printed the report to see if my old skills with BBC BASIC and then MS QuickBASIC would help me to work out how to add Sex/stage to the search and then print that. The only thing I did manage to change successfully was the order in which the variables are printed in the final report - a rather trivial change. It is frustrating not having the skills to do more with XML reports. Is there any possibility of adding Sex/stage to the search? It is a mandatory field for importing, so everybody should have an entry in that field...
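
The effect of adding Sex/stage to the match can be illustrated with a small sketch. It is not the logic of report H1a, which is an XML report; the rows and their layout (location, date, observer, species, sex/stage) are hypothetical and only show how an extra field in the key stops separately recorded males and females being flagged.

```python
from collections import Counter

# Hypothetical rows: (location, date, observer, species, sex_stage)
rows = [
    ("Pond", "2010-05-01", "obs1", "Mallard", "male"),
    ("Pond", "2010-05-01", "obs1", "Mallard", "female"),
    ("Pond", "2010-05-01", "obs1", "Coot", "adult"),
    ("Pond", "2010-05-01", "obs1", "Coot", "adult"),
]


def count_flagged(rows, n_fields):
    """Count rows whose first n_fields values occur more than once."""
    counts = Counter(row[:n_fields] for row in rows)
    return sum(c for c in counts.values() if c > 1)


print(count_flagged(rows, 4))  # key without sex/stage
print(count_flagged(rows, 5))  # key including sex/stage
```

With the shorter key all four rows are flagged; with Sex/stage included only the two identical Coot rows remain.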

The duplicate report will be extremely helpful for cleaning up the data held in R6. If it could choose rather fewer records it would be more likely that we users could do the cleaning up.

Again many thanks, Mike, for what you have done.

Cheers, Ian