1 (edited by TonyP 06-12-2010 15:27:12)

Re: Wizard forgot my Matches

Well I spent 3 weeks doing many iterations of importing one large dataset. I committed all my matches, created new taxons where required.
At the end of the import I had 12,000 rejections. I found out what the problem was and cancelled the import. Made the correction and started the import.

There were about a dozen names it had forgotten saying there were duplicates. It has forgotten all the species I matched. Not one has been remembered.

I keep finding myself asking which bit [singular] of Recorder works? It took a week to assign those species names, checking for synonyms that Recorder doesn't know and adding in species that have been recorded since 1860.

Should I now be spending another week doing what I did last week only to have to do it all over again next time I get some data?

I just checked IW_Matched_Species and found 112 entries instead of the thousands I had committed last week. Isn't a database supposed to save records rather than cache them and forget later or even just save them?

As I think I've said before it's not really a wizard more a pain down one side.

I just checked and the first species that was not matched by any of the lists I tried is in IW_Matched_Species so quite clearly not much of a wizard.

Tony
PS a working version would be nice before I retire in 20yrs time.

Data Manager
Somerset Environmental Records Centre

2

Re: Wizard forgot my Matches

It looks to me as though the Wizard only remembers the matches which are done manually, not those where the match is against the list chosen at the top of the matching window. For example, if I select a specified list and match against this and then have some unmatched species left, which I then match against other lists, then only these extra matches are saved. Next time I match against the selected list it rematches the ones it did automatically and also uses the matches I have done manually. This matching does not seem to happen if I choose a different list. The table which holds the matches has the key for a taxon list in it, and it seems that this is somehow involved in the matching. I don't understand what is happening here, but it does seem to work as long as I take a consistent approach to matching and do not use different lists. The main problem is that none of this is explained anywhere in the detail required.

Dave

3

Re: Wizard forgot my Matches

I would agree, David: not enough explanation.

I did find entries that had apparent matches made by me manually not being recognised. I also had entries in the table where the match item key was Null. This is clearly wrong: a matched key is mandatory, so this should not be allowed. I suspect either a programming omission or a limitation inherited from the original Access database. MS SQL can have triggers set to prevent this happening.
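As a rough illustration of the kind of safeguard meant here, the sketch below uses Python's built-in sqlite3 module and a NOT NULL column constraint rather than an MS SQL trigger. The table and column names (matched_species_demo, matched_value, matched_key) are made up for the example and are not the real Import Wizard schema.

```python
import sqlite3


def build_demo_table():
    """Create an in-memory table whose matched key may never be null."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE matched_species_demo ("
        " matched_value TEXT NOT NULL,"  # the name as it appeared in the import
        " matched_key TEXT NOT NULL)"    # hypothetical key of the matched taxon
    )
    return conn


def try_save(conn, matched_value, matched_key):
    """Return True if the row was stored, False if the database refused it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO matched_species_demo VALUES (?, ?)",
                (matched_value, matched_key),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

With a constraint like this in place, a match with no key is rejected at save time instead of sitting in the table unnoticed.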

I have to admit I had hoped that all the decisions I had made would make it into the stored information. As I don't appear to be able to search all the preferred lists at once, I have to go through every preferred list one at a time, and it seems I will have to do this every time. I guess they assumed one would only ever import data from one species group. It would help if one could use all the preferred lists, especially as it is expected that these will replace the others in future; at least that seems to be the implication.

It still doesn't explain why it would not match those species that were there and apparently correctly filled out by the system.

I'm now onto locations and it has not remembered those either.

Data Manager
Somerset Environmental Records Centre

4 (edited by MikeWeideli 07-12-2010 23:51:54)

Re: Wizard forgot my Matches

The only matches which are saved are those where the user finds a match in a Dictionary. Those made automatically are not saved. When the manual matches are saved, the matched_checklist_key is taken from the checklist in use at the time of the save. Where 'All preferred lists' has been chosen, the matched_check_list_key is null. When search is used to get the saved taxa, it first looks at saved matches for the currently selected dictionary list, then for any automatic matches against this list. The way the matches are saved against different dictionaries can cause problems at this stage, as the matches may not have been saved against the current list and will therefore not be accessed.
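To make the lookup order easier to follow, here is a minimal Python sketch of the behaviour as described in this post. It is a model of my description only, not the Wizard's actual code; the class, method and key names are all hypothetical, and 'All preferred lists' is represented by None to mirror the null checklist key.

```python
ALL_PREFERRED = None  # assumption: stands in for the null checklist key


class MatchStore:
    """Toy model: manual matches are saved per (name, checklist) pair."""

    def __init__(self):
        self._saved = {}  # (lower-cased name, checklist key) -> taxon key

    def save_manual(self, name, taxon_key, current_list):
        # The checklist key comes from whichever list is selected at save time.
        self._saved[(name.lower(), current_list)] = taxon_key

    def lookup(self, name, current_list, auto_matches):
        # 1. Saved manual matches, but only those stored against the current list.
        saved = self._saved.get((name.lower(), current_list))
        if saved is not None:
            return saved
        # 2. Otherwise fall back to an automatic match within the current list.
        return auto_matches.get(current_list, {}).get(name.lower())
```

In this model a manual match saved while a birds list is selected is invisible when a plants list, or 'All preferred lists', is selected later, which is the "forgotten matches" effect reported above.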

The approach which appears to work best is to do the search against 'All Preferred Lists'. As far as I can see this does what it says and finds the matches on a preferred list, not just on the species to be matched, but on any synonyms identified in Index_taxon_name via NameServer. Anything which is not automatically matched this way, or any match which is not considered right, can then be matched manually against the appropriate list, but it is important not to change the checklist setting (All Preferred).

If this process is followed every time then matching will pick up everything matched before and those new ones which can be matched automatically. Only those new entries which can't be matched automatically will need to be matched manually.

During the testing of the process I have noticed that delete/replace is not working for 'All Preferred Lists'. What is happening instead is that a new entry is being created. This means that subsequent matching of those species is arbitrary. This has been raised as a high priority bug, because there seems to be no easy way to get around this.

Mike Weideli

5

Re: Wizard forgot my Matches

This seems more complicated than it needs to be. Surely a species name should only have one match? In what circumstances would you need alternatives based on the list you were using for the main matching?

Dave

6

Re: Wizard forgot my Matches

Can I just say that I am not defending what is happening, just attempting to explain it. I rarely use the Wizard, as most of the work I do is importing data where the Wizard can't cope or would take forever to set up, so I was not that familiar with exactly how it worked until I looked into it yesterday. I suspect that making the matching work on a combination of taxon name and chosen dictionary is intended to make the matching more flexible. Some users don't like matching against the preferred list because the taxon allocated does not have the same name as that originally recorded. What they can do under the current system is to match a list which uses the older names against, say, the R3.3 list and attach the manual matches to that. Then for a list which uses more modern names they can use a preferred list and attach perhaps a different set of manual matches to this. The testing I have done suggests that, apart from the one thing I picked up, it is working, but I agree that the way it works can be confusing and that it could do with looking at in more detail. It would also help if the import wizard tables could be updated by a Batch Update, as this would at least give users more control.

Mike Weideli

7

Re: Wizard forgot my Matches

I use a variety of lists for my matching. You have explained why matches sometimes appear to be lost, but I am not sure how best to use the facility to avoid this. Also I am unsure how I can correct what is already in the matching table. 

Dave

8 (edited by MikeWeideli 08-12-2010 21:08:38)

Re: Wizard forgot my Matches

I am still experimenting with this. See my post below following John's comments.

Mike Weideli

9

Re: Wizard forgot my Matches

Again, not defending anything but perhaps I can explain the rationale. Let's take a simple example - on one day you are importing a list of bird records, which includes a redshank. So you search against the birds list and match your names, including the redshank. Next day you import a different list of plants, including a redshank, so you search and match against a plants list. If the system remembered matches across all lists, it would automatically match any homonyms and shared common names incorrectly. To avoid this happening a decision was taken during the design of the Import Wizard that matching can only assist you within the context of a list - as no matching is better than bad matching.
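The redshank case can be sketched in a few lines of Python. This is only an illustration of the design trade-off, with invented function names and keys, not the Wizard's real data structures: one memory is keyed by name alone, the other by list and name together.

```python
def remember_by_name(sessions):
    """Remember one match per name, regardless of which list was used."""
    memory = {}
    for _checklist, name, taxon_key in sessions:
        memory[name.lower()] = taxon_key  # a later homonym overwrites the earlier match
    return memory


def remember_by_list(sessions):
    """Remember matches within the context of the list they were made against."""
    memory = {}
    for checklist, name, taxon_key in sessions:
        memory[(checklist, name.lower())] = taxon_key
    return memory


# Hypothetical import history: (list searched, imported name, matched taxon key)
SESSIONS = [
    ("birds", "Redshank", "BIRD-KEY"),
    ("plants", "Redshank", "PLANT-KEY"),
]
```

With the name-only memory, the next bird import would silently get the plant. With the list-scoped memory both matches survive, at the cost that a match is only found again when the same list is selected.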
Now that we see how the Import Wizard is being used and how the dictionaries have evolved, there are opportunities to rethink this a little. For example, when you save a set of matches perhaps it could remember ALL the matches for everything, but will only reload the matches when you load that same file (or a file which has the same taxonomic context). So perhaps you could say what kind of data you are importing, e.g. plants, birds etc, then when you save matches all the matches are kept across all lists when you import the next file of the same type.
Just thinking aloud - would welcome any thoughts on how best this should be tackled, as it is obviously an important point.
Best Wishes

John van Breda
Biodiverse IT

10 (edited by MikeWeideli 08-12-2010 21:12:19)

Re: Wizard forgot my Matches

From John's explanation, the way this should work with a list of taxa which covers several groups is that you process each dictionary list involved in turn. Use search to pick up the automatic matches, then do any manual matches for the list and commit these before moving on to the next list. This approach should solve the homonym issue and mean that you will always get the matches you need, as long as you follow the same procedure. The main thing is not to commit matches with the wrong dictionary list selected, and always to commit manually. If you leave the system to commit when the import is run, then it will put all the matches against the last list selected.
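The difference between committing list by list and leaving the commit to the import run can be sketched as below. This is a simplified Python model under my own assumptions (the function names and keys are invented, and the real tables hold more than this), intended only to show why the last list selected matters.

```python
def commit_per_list(manual_by_list):
    """Commit each list's manual matches while that list is still selected."""
    saved = {}
    for checklist, matches in manual_by_list.items():
        for name, taxon_key in matches.items():
            saved[(name, checklist)] = taxon_key
    return saved


def commit_at_import(manual_by_list):
    """Deferred commit: every match is stored against the last list selected."""
    saved = {}
    last_selected = list(manual_by_list)[-1]  # dicts keep insertion order
    for matches in manual_by_list.values():
        for name, taxon_key in matches.items():
            saved[(name, last_selected)] = taxon_key
    return saved


# Hypothetical session: birds were matched first, plants second.
MANUAL = {
    "birds": {"redshank": "BIRD-KEY"},
    "plants": {"redshank": "PLANT-KEY"},
}
```

In the deferred case the bird match ends up filed under the plants list and is then overwritten by the plant of the same name, so only one of the two manual matches survives.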

Mike Weideli

11 (edited by RobLarge 09-12-2010 10:09:44)

Re: Wizard forgot my Matches

Seems to me that if an item cannot be matched automatically from the current list, the wizard should check all lists and present all possible matches as a list of options for the  user to select from. This list could include any previously saved manual matches for the same taxon name and should indicate the list each comes from (or preferably the taxon group to which it belongs).
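Something along these lines is what I have in mind, sketched in Python purely to show the shape of the idea. The Candidate fields, the function name and the sample entries are all invented for illustration and do not correspond to anything in the current Wizard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    taxon_key: str     # hypothetical key of the candidate taxon
    checklist: str     # list the candidate comes from
    taxon_group: str   # group shown to the user, e.g. "bird" or "flowering plant"
    preferred: bool    # True if the candidate is on a preferred list


def candidates_for(name, dictionary):
    """Return every candidate for a name across all lists, preferred ones first."""
    hits = [cand for entry_name, cand in dictionary if entry_name.lower() == name.lower()]
    return sorted(hits, key=lambda c: (not c.preferred, c.taxon_group))


# Hypothetical dictionary entries: (name, candidate)
DICTIONARY = [
    ("Redshank", Candidate("PLANT-KEY", "old plant list", "flowering plant", False)),
    ("Redshank", Candidate("BIRD-KEY", "preferred bird list", "bird", True)),
    ("Lapwing", Candidate("LAP-KEY", "preferred bird list", "bird", True)),
]
```

The user would then pick from the returned options by taxon group, rather than having to know which list to search.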

In general though I think the whole concept of lists, although useful in the context of keeping the list providers happy, is unhelpful to most end-users and would best be concealed. I would rather be presented with a list of possible matches, with one (or more in the case of homonyms) preferred options indicated (along of course with the taxon group to which they belong).

Rob Large
Wildlife Sites Officer
Wiltshire & Swindon Biological Records Centre

12

Re: Wizard forgot my Matches

Hi Rob
In effect what you are suggesting is that the search against the preferred list becomes the default behaviour and that the ability to browse other lists during matching is made an "advanced" option so most users won't need to bother with it. Also that if there are duplicate matches, rather than no match being made, list the potentials along with their taxon group and let the user select?

John van Breda
Biodiverse IT

13

Re: Wizard forgot my Matches

Yeah I think that sums it up.

I think in most cases people will be more interested in the taxon group than the list anyway, although of course priority should be given to the preferred lists.

Rob Large
Wildlife Sites Officer
Wiltshire & Swindon Biological Records Centre