1

Re: Grouping of Surveys in Recorder

As suggested, reposting this idea to Feature Requests from General Chat. http://forums.nbn.org.uk/viewtopic.php?id=43 for full build-up.

charlesr wrote:

This is one of those features that I'd like to see go on the wishlist - the categorisation of surveys within Recorder so that sets of surveys can be grouped together in some sort of hierarchical folder-like structure. This would then make it not only very easy to keep distinct sets of data separate from each other, but would also allow us to combine them just as easily. The ability to 'fence-off' certain sets of data is becoming increasingly important for us. For instance, there is an increase in the number of cross-county-border enquiries we need to do. This requires obtaining records from other centres, so we'd like to be able to put other counties' data into their own folders.

Gordon Barker wrote:

This is also something I would like to see, to allow us to group surveys, possibly imported from different organisations, separately from our own dataset, keep the whole "filing system" less cluttered and make items in the survey/event hierarchy easier to find. I would prefer this top level of survey groups not to be locked to the ID number, for maximum flexibility.

Gordon

How radical a reorganisation would this actually require?

Gordon Barker
Biological Survey Data Manager
National Trust

2

Re: Grouping of Surveys in Recorder

Before I think about the costs, am I right in saying that what you want is an ability to create a simple folder hierarchy inside the Observations window, into which you can drag your surveys?  Also, is this folder hierarchy a personal thing which would not be exported, or should the position within the hierarchy be an exportable attribute?

John van Breda
Biodiverse IT

3

Re: Grouping of Surveys in Recorder

I think it may be more than a simple (possibly cosmetic) alteration of the Observations window to include top level folders. To me, it's an extension to the data model - a top level "DATASET"?

This would have the advantage of being able to export a group of surveys which have a common theme. For example a data provider like CCW or one particular recorder who is running concurrent surveys.

I think it is also important to allow one survey to be a member of many "datasets".

I see a dataset as a set of pointers to surveys. I would think the modifications to Recorder would be another table to hold a list of grouped surveys and an iterative method of exporting all surveys within that group in one go. If the output format is zipped MDB, would it make sense to have all of those exported surveys in one file? It would make the distribution and re-import of the data easier. The import methods would need to be modified to handle multi-survey imports as well.
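
As a rough illustration of the "set of pointers" idea, here is a minimal sketch using Python's built-in sqlite3 module. Every table and column name in it (survey, dataset, dataset_survey and so on) is an assumption made up for the example and does not come from the Recorder data model. It only shows how one extra link table lets a survey sit in several datasets at once.

```python
import sqlite3

# Hypothetical schema: none of these table or column names come from Recorder.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE survey (survey_key TEXT PRIMARY KEY, item_name TEXT);
    CREATE TABLE dataset (dataset_key TEXT PRIMARY KEY, item_name TEXT);
    -- Link table: one row per (dataset, survey) pointer, so a survey
    -- can belong to any number of datasets.
    CREATE TABLE dataset_survey (
        dataset_key TEXT REFERENCES dataset(dataset_key),
        survey_key  TEXT REFERENCES survey(survey_key),
        PRIMARY KEY (dataset_key, survey_key)
    );
""")
conn.executemany("INSERT INTO survey VALUES (?, ?)", [
    ("S1", "Otter survey 2005"),
    ("S2", "Bat roost survey"),
    ("S3", "River habitat survey"),
])
conn.executemany("INSERT INTO dataset VALUES (?, ?)", [
    ("D1", "CCW"),
    ("D2", "Riparian"),
])
# Survey S1 is deliberately placed in both datasets.
conn.executemany("INSERT INTO dataset_survey VALUES (?, ?)", [
    ("D1", "S1"), ("D1", "S2"), ("D2", "S1"), ("D2", "S3"),
])


def surveys_in_dataset(dataset_name):
    """Return the names of all surveys pointed to by one dataset."""
    rows = conn.execute("""
        SELECT s.item_name
        FROM dataset d
        JOIN dataset_survey ds ON ds.dataset_key = d.dataset_key
        JOIN survey s ON s.survey_key = ds.survey_key
        WHERE d.item_name = ?
        ORDER BY s.item_name
    """, (dataset_name,)).fetchall()
    return [name for (name,) in rows]
```

An export "in one go" would then amount to looping over the result of surveys_in_dataset() and exporting each survey in turn.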

So, yes please - I'd like to vote for this feature too if it can be afforded.

Kind Regards.

Dave Cope,
Biodiversity Technology Officer,
Biodiversity Information Service for Powys and Brecon Beacons National Park.

4

Re: Grouping of Surveys in Recorder

I was thinking along the lines of what Dave expanded on, and I think his idea of being able to place surveys within many datasets sounds incredibly useful. So it would be a many-to-many relationship. This sounds a lot like tagging, rather than hierarchy, as found in online apps such as del.icio.us and ma.gnolia, where one URL can have many tags to describe it. This allows you to drill down and isolate information quickly. For an idea of this in action have a look at my del.icio.us bookmarks filtered by the 'database' and 'SQL' tags.

Just thinking off the top of my head, here's an example use in Recorder: we have in our database records from Kent and from Surrey. While we need to use these for some tasks, they generally need to be ringfenced, not reported on and generally excluded from our day-to-day workings. Surveys from these areas could get 'tagged' with the keywords 'kent', 'surrey', 'not for reporting'. Our series of Sussex Ornithological Society surveys would get tagged with 'SOS', 'sussex' and 'not for export'. The surveys from Chichester Harbour Conservators would get tagged as 'CHC', 'sussex' and 'coastal', while their marine surveys would get an additional 'marine' tag.

Being able to tag surveys with keywords and then display them according to (multiple) tag(s) in the obs hierarchy and also export them, report, and so on, seems to me to be a sensible goal for the future, if data in Recorder are to remain manageable. The point is, we have groups of many surveys that fit into many categories for different purposes and no way of grouping them at present.

Extending the idea further, the tags could have attributes themselves, such as a confidential attribute and a not for export attribute. I say this because, at present, there is no way of easily excluding data from an export, which is crucial functionality. E.g. If I do a polygon report on an area and want to export the results, but exclude the bird data only, how can I do that? Equally, if I want to extract all of the bird data, but exclude Kent and Surrey, how can I do it?

Finally, I would say that the tags/keywords should be exportable.

So, to sum up:

* Surveys to have multiple 'tags'
* Obs hierarchy able to filter its display based on one or more tags
* Report wizard able to include/exclude based on one or more tags
* Export all surveys with a set of specified tags
* Tags are exportable
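
The include/exclude behaviour in the list above can be sketched in a few lines of Python. This is only an illustration under assumptions: the tags are the ones suggested in this post, but the survey names, the SURVEY_TAGS structure and the select_surveys() function are all hypothetical and say nothing about how Recorder would store or query tags.

```python
# Hypothetical data: the tags come from the examples in this post, the
# survey names and the structure are invented for illustration only.
SURVEY_TAGS = {
    "Kent records": {"kent", "not for reporting"},
    "Surrey records": {"surrey", "not for reporting"},
    "SOS breeding birds": {"SOS", "sussex", "not for export"},
    "CHC saltmarsh": {"CHC", "sussex", "coastal"},
    "CHC seabed": {"CHC", "sussex", "coastal", "marine"},
}


def select_surveys(include=(), exclude=()):
    """Return surveys carrying every 'include' tag and no 'exclude' tag."""
    include, exclude = set(include), set(exclude)
    return sorted(
        name
        for name, tags in SURVEY_TAGS.items()
        if include <= tags and not (exclude & tags)
    )
```

For example, select_surveys(include=["sussex"], exclude=["not for export"]) picks out the two Chichester Harbour surveys and leaves the Sussex Ornithological Society data behind.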

Whew, that's enough brainstorming for now -- it's too hot for any more!

Charles

Charles Roper
Digital Development Manager | Field Studies Council
http://www.field-studies-council.org | https://twitter.com/charlesroper | https://twitter.com/fsc_digital

5

Re: Grouping of Surveys in Recorder

Yes, my original idea was for a simple folder hierarchy to hold/organise Surveys in the window. Exportability (of the hierarchy) was not something I had considered as I would want to fit any imports into my own hierarchy. Exportability would then require importability and presumably multiply the cost/time required but would probably be useful, so I would say yes if it doesn't make the cost rocket. Or simple version first, extras maybe later?

Looking at Dave's suggestions, could queries/reports/exports be done via the Report Wizard>Restrict report to Sources>Surveys dialogue boxes? Possibly with a drop down there for "Survey Group" or whatever they might end up being called?

Gordon

Gordon Barker
Biological Survey Data Manager
National Trust

6

Re: Grouping of Surveys in Recorder

I think what Charles is hinting at goes beyond tags or a simple top level folder structure.  At the moment, the Thesaurus module developed for the Collections Module addin is only used in Recorder to add keywords (~tags) to references.  This allows you to search for references by keyword, and it supports synonymy and hierarchy (search for bats could find references tagged with chiroptera and pipistrelles for example).  It strikes me that all we need to do is enable the Thesaurus driven keywords to be linked to Surveys as well (and possibly other data types?).  Then, there need to be a few options implemented as to how you use the keywords, tags or whatever you call them - for example:
1. Option to use top level folders for surveys based on their keywords (allowing you to export one of these folders).
2. Option to display a tag cloud to help you find surveys and references (see http://del.icio.us/tag/)
3. Option to search for tags/keywords instead of survey names.
4. Option to limit the report wizard output to data linked to one or more keywords.
and possibly some more.
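
To make the synonymy and hierarchy point concrete (a search for bats finding references tagged with chiroptera or pipistrelles), here is a minimal Python sketch. It is not the Thesaurus data model: the PARENT and SYNONYM dictionaries, the reference names and the function names are all assumptions chosen for the example.

```python
# Hypothetical miniature thesaurus, not the real Thesaurus model.
# Child term -> parent term.
PARENT = {
    "pipistrelles": "bats",
    "noctules": "bats",
    "bats": "mammals",
}
# Synonym -> preferred term.
SYNONYM = {"chiroptera": "bats"}

# Hypothetical references and the keywords attached to them.
REFERENCE_KEYWORDS = {
    "Ref A": {"chiroptera"},
    "Ref B": {"pipistrelles"},
    "Ref C": {"mammals"},
}


def preferred(term):
    """Resolve a synonym to its preferred term."""
    return SYNONYM.get(term, term)


def is_within(term, ancestor):
    """True if term is the ancestor itself or sits anywhere below it."""
    term = preferred(term)
    while term is not None:
        if term == ancestor:
            return True
        term = PARENT.get(term)
    return False


def find_references(search_term):
    """Find references tagged with the term, a synonym, or a narrower term."""
    target = preferred(search_term)
    return sorted(
        ref
        for ref, keywords in REFERENCE_KEYWORDS.items()
        if any(is_within(keyword, target) for keyword in keywords)
    )
```

Searching for "bats" returns Ref A and Ref B but not Ref C, because the walk only goes from narrower terms up to broader ones.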

The thesaurus model is ideal for this task because it is powerful, hierarchical, multi-lingual and supports complex relationships between the terms.  For example a survey of otters could use the thesaurus to populate the Show Related facility with other surveys such as surveys of river habitat.  I'm sure you can think of better examples!  It also allows you to add any number of facts or attributes to a tag - for example full descriptions, confidentiality flags or images.  However this facility is not yet supported without the Collections Module's Thesaurus Editor application installed.

Any comments?

John van Breda
Biodiverse IT

7

Re: Grouping of Surveys in Recorder

Yes, this sounds ideal John. I had a sneaking suspicion you might suggest the Thesaurus in regards to this, I just haven't used it and can't be sure of what it does and how it does it just yet. It's starting to make more sense now. So to expand on your examples and just mull them over a bit:

1. A top-level folder for containing surveys can be created by specifying one or more tags. Surveys with those tags attached to them get included in the folder. These folders can then be exported, i.e. the surveys within them get exported.

2. I don't see the use of a tag cloud; do you have an example of what you envisage? Would the size of the tag be dictated by number of records? Or number of surveys? Or what?

3. Option to search by tag would be handy. Filtering, i.e. a persistent 'set' of tags, would be achieved via the tag folders idea mentioned above.

4. The report wizard should be able to output data based on one or more tags, but it should also be able to exclude data based on one or more tags too. This is crucial.

It would also be very useful if the export filter tool could include/exclude based on one or more tags.

Do others have any ideas on how sets of data, i.e. groups of surveys, could be used?

Given that the Thesaurus is a complex beast (at least it looks that way), is it going to make it much more difficult to write manual queries (for XML Reports and such) that make use of it?

Charles Roper
Digital Development Manager | Field Studies Council
http://www.field-studies-council.org | https://twitter.com/charlesroper | https://twitter.com/fsc_digital

8

Re: Grouping of Surveys in Recorder

So is that what the Keyword tab is for in the document window? We have begun to add linked documents such as warden reports etc. and were wondering how best to implement this function; there's nothing in the help files relating to it.

When you say it needs the Collections module to work, is it a redundant feature just now, like the 'Determiner Role' Validation competency tab present in term lists?

This is a feature I would very much like to see incorporated into the Individual hierarchy within 'Individuals and Organisations', essentially because many of our recorders' expertise has evolved as they progressed through survey/training teams. So to label them with a single level of competency seems a little harsh. It would be far better to be able to say that "well some of his/her early records may be doubtful, however after 15 years in this field, he/she can now be considered a national expert".

9

Re: Grouping of Surveys in Recorder

A tag cloud would be a fun way of looking into a dataset and seeing what is there and how much of it.  This would become more interesting if tags were enabled on multiple data types, including surveys, references but also locations and so forth.  It's more of a data exploration tool than anything of real use!

The Thesaurus is actually probably a simpler beast to use than the collection of dictionaries and term lists that it replaces in the Collections Module.  Firstly, there is only 1 model rather than 4 to learn (taxon, biotope, admin area and term lists).  There are views to flatten the data so getting the actual term or common name of anything is simple (see VW_ConceptTerm and VW_ConceptTermCommon).  The data model is also designed so that tables such as Term_Version and Concept_Group_Version (roughly equivalent to Taxon_Version and Taxon_List_Version) only come into play when they actually need to, so are rarely involved in queries.  Therefore the path through the model from a to b tends to be much shorter.  Finally, there are no large index tables to maintain except the Concept_Lineage table (which replaces Index_Taxon_Group) - the simplicity of the model means Index_Taxon_Synonym and Index_Taxon_Name are not needed or are replaced by dynamic views.  It does get a bit complicated to use when you start looking at fact and relationship inheritance (e.g. fish have scales, therefore blennies have scales as well - you only need the single relationship at the higher level).
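
The fact inheritance example (fish have scales, therefore blennies have scales) can be pictured with a very small Python sketch. This is purely illustrative: the PARENT and FACTS dictionaries, the "vertebrates" level and the inherited_facts() function are assumptions, and the sketch does not use or describe Concept_Lineage or any other Thesaurus table.

```python
# Hypothetical concept hierarchy: child concept -> parent concept.
PARENT = {"blennies": "fish", "fish": "vertebrates"}

# Facts are stored once, at the highest level where they apply.
FACTS = {
    "fish": {"has scales"},
    "vertebrates": {"has backbone"},
}


def inherited_facts(concept):
    """Collect the facts of a concept plus those of all its ancestors."""
    facts = set()
    while concept is not None:
        facts |= FACTS.get(concept, set())
        concept = PARENT.get(concept)
    return facts
```

Here inherited_facts("blennies") picks up "has scales" from fish without that relationship ever being stored against blennies.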

If you have a play with the Reference Keywords editor this gives you a very simple editor for a thesaurus list.  This doesn't support many of the really powerful features of the Thesaurus though.

John van Breda
Biodiverse IT

10

Re: Grouping of Surveys in Recorder

Data exploration, i.e. making data accessible and generally interesting and fun to explore is something that should never be overlooked. Making something a pleasure to use is what hooks users in and gives them the drive to learn more deeply about the software. Two weblogs I enjoy reading very much on these two subjects are Jensen Harris's Office User Interface Blog and Creating Passionate Users. So yes, tag clouds and new ways of exploring data get my vote. Would the various 'views' be a) exportable and b) reportable, i.e. flattened into a table for export to Excel, for example; or could a view be saved as a filter, so that the report wizard and export filter could include/exclude based on it?

Thanks for the explanation of the Thesaurus, I'll have a closer look at it.

Charles Roper
Digital Development Manager | Field Studies Council
http://www.field-studies-council.org | https://twitter.com/charlesroper | https://twitter.com/fsc_digital

11 (edited by davec 19-07-2006 14:18:42)

Re: Grouping of Surveys in Recorder

John, Charles et al,

Some interesting ideas for sure. I'm not familiar with the thesaurus in 6, but are we talking about a form of full text search? (on "tags", "keywords").

Being able to tag (via user defined words) data is a very powerful technique, as is of course adding in the actual data itself. In PostgreSQL we use the TSearch2 module to provide this facility. Sorry if this is going a bit off topic, but as we seem to be talking about data exploration, could I illustrate with some code? (Which is all work in progress BTW.)

Once a table has a column of type tsvector;

CREATE TABLE search ( vector tsvector)

and is indexed;

CREATE INDEX search_index ON search USING GIST(vector)

it is then just a case of "pouring in" data/keywords etc:

INSERT INTO search VALUES (to_tsvector('Some data you want to find later'))

The example below is a query based on the INDEX_TAXON_NAME table, which I've migrated to PostgreSQL and also made into a separate full search table. The full text search returns a set of keys to join back to the index_taxon_name table in PostgreSQL.

SELECT
    DISTINCT actual_name, common_name, preferred_name, abbreviation, rank(vector,q) * 10 AS rank
FROM
    index_taxon_name, index_taxon_name_fullsearch, to_tsquery('Fox&Moth') AS q
WHERE
    vector @@ q
AND
    index_taxon_name.id = index_taxon_name_key
ORDER BY
    rank(vector,q) * 10 desc, preferred_name

Data Returned;
actual_name, common_name, preferred_name, abbreviation, rank
===========================================
"Fox Moth","Fox Moth","Macrothylacia rubi","fomot",0.99103219807148
"Macrothylacia rubi","Fox Moth","Macrothylacia rubi","marub",0.99103219807148

The above query for "Fox" and "Moth" returns two records in 1.4 secs; a query on just "Moth" returns 453 in 12 secs. The vectored search table has actual_name, common_name, preferred_name and abbreviation contained within it and has 1,250,380 rows.
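
For anyone without PostgreSQL to hand, the idea behind the 'Fox&Moth' query (every word must be present) can be sketched with a toy inverted index in Python. This is not how TSearch2 is implemented and it does no stemming or ranking; the sample names and the function names are assumptions for illustration only.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for the indexed name columns.
NAMES = ["Fox Moth", "Macrothylacia rubi", "Red Fox", "Emperor Moth"]


def build_index(texts):
    """Map each lower-cased word to the set of row ids containing it."""
    index = defaultdict(set)
    for row_id, text in enumerate(texts):
        for word in text.lower().split():
            index[word].add(row_id)
    return index


def search_all(index, *words):
    """Return ids of rows containing every word (the '&' in 'Fox&Moth')."""
    sets = [index.get(word.lower(), set()) for word in words]
    return sorted(set.intersection(*sets)) if sets else []


INDEX = build_index(NAMES)
```

With this sample data, search_all(INDEX, "Fox", "Moth") matches only the "Fox Moth" row, while a search on "Moth" alone matches both moth rows.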

If this is the sort of technology John mentioned as the thesaurus then it would be great to see it extended in Recorder.

(Edited to remove typos due to heat stroke!)

Regards,

Dave Cope,
Biodiversity Technology Officer,
Biodiversity Information Service for Powys and Brecon Beacons National Park.

12

Re: Grouping of Surveys in Recorder

Full text search is a different issue to tagging - full text search is designed to let you find words or phrases in data which you haven't "prepared".  Tagging allows you to categorise your data so you or other users can navigate the data more easily later.

Full text search is available in SQL Server as well as PostgreSQL but not in SQL Server Express.

John van Breda
Biodiverse IT

13 (edited by davec 19-07-2006 15:43:15)

Re: Grouping of Surveys in Recorder

I did wonder if SQL Server had FTS as well, John. Thanks for the clarification on tagging. Just a couple of questions/thoughts;

How do other users know what tags you have assigned? Would it depend on a visual representation such as a tag cloud (which is quite nice), and would there be a place in the system where the tag ranks could be found? I'm thinking here of external software that links to Recorder.

Would it be possible to do bulk tagging based on search criteria?

Dave Cope,
Biodiversity Technology Officer,
Biodiversity Information Service for Powys and Brecon Beacons National Park.

14

Re: Grouping of Surveys in Recorder

Hi Dave

As we're just discussing ideas here, I don't know that I can answer your questions other than to say that there could be a way of bulk tagging and also to explore existing tags and ranks.  I suppose it really depends if the idea is taken up and funded.

John van Breda
Biodiverse IT

15 (edited by davec 20-07-2006 08:34:02)

Re: Grouping of Surveys in Recorder

johnvanbreda wrote:

Hi Dave

As we're just discussing ideas here, I don't know that I can answer your questions other than to say that there could be a way of bulk tagging and also to explore existing tags and ranks.  I suppose it really depends if the idea is taken up and funded.

That's true! I guess I was thinking out aloud really. ;)

I've also been thinking more about tagging as I see some crossover with FTS, e.g. an FTS table of tags could easily return frequencies to build into a tag cloud. I'll work on this tagging lark in our Merlin IMS (which links to Recorder) as FTS is a key part of that and it sounds like fun! :cool:
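
As a rough sketch of turning tag frequencies into a tag cloud, something like the following Python would do. It assumes the size of a tag is driven by the number of surveys carrying it (one of the options raised earlier in the thread); the tag_cloud() function, the sample data and the font size range are all hypothetical.

```python
from collections import Counter


def tag_cloud(survey_tags, min_size=10, max_size=30):
    """Map each tag to a font size scaled by how many surveys carry it."""
    counts = Counter(tag for tags in survey_tags.values() for tag in tags)
    if not counts:
        return {}
    low, high = min(counts.values()), max(counts.values())
    span = (high - low) or 1  # avoid dividing by zero when all counts match
    return {
        tag: min_size + (n - low) * (max_size - min_size) // span
        for tag, n in counts.items()
    }


# Hypothetical sample: survey key -> tags attached to that survey.
SAMPLE = {
    "S1": ["sussex", "coastal"],
    "S2": ["sussex", "coastal", "marine"],
    "S3": ["sussex"],
}
```

Counting records rather than surveys would only change what goes into the Counter; the scaling step stays the same.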

Dave Cope,
Biodiversity Technology Officer,
Biodiversity Information Service for Powys and Brecon Beacons National Park.