1 (edited by JamesPerrins 05-11-2008 10:40:49)

Re: Speed enhancements to the NBN gateway

Hi,

I have been asked to look at how the web services might be sped up.  This broadly covers two areas:

1) Alterations to the existing web services so that the data actually wanted can be asked for and returned (in a number of circumstances more data has to be requested than is actually wanted, and it is then filtered down on the client side)

2) Identifying any bottlenecks within the web services themselves, working with the gateway team, and seeing if we can do anything about them

What we are trying NOT to do is get dragged down the path of discussing lots of additional functionality for the web services (unless it has a direct speed implication), though I will try to capture any such requests that are raised so they can be targeted in the next phase of development.

If anyone has any ideas - especially on point 1 above - I would be most interested to hear from them.  To set the ball rolling I've listed below the ideas that have already been raised.  Additional detail on these is available in http://forums.nbn.org.uk/uploads.php?fi …  Forum.doc

I look forward to any comments - please feel free to query me if you have any questions

Best wishes
James Perrins

1    Improving speed of existing web services (and related)   
1.1    Compress the soap response from tomcat   
1.2    Set a maximum allowable return size   
1.3    Allow an option switch to return just the map   
1.4    Improve filtering options on the request   
1.5    Extend / enhance Discovery services – Species List   
1.6    Additional discovery services   
1.7    Bulk download option   
1.8    Improvements to metadata schema (first loaded and last updated)   
1.9    Return dictionary data from taxonomy web service even if there are no records for this taxon on the Gateway.   
1.10    Links to metadata   
1.11    Consider returning empty result set instead of SOAP error when there are no data   
1.12    Tracking implementations of web services - Web services registration
1.13    WMS / WFS implementation   
1.14    Web services version control   

2    Additional functionality requested   
2.1    Grid mapping against all stored boundaries   
2.2    Latitude and longitude in response
2.3    Include species designations in taxonomy web service
2.4    GIS and other client tool development
2.5    Include map legends with map images
2.6    Occupied grid squares for a species to assist validation clients
2.7    Geographic metadata mapping
2.8    User defined colours on maps   
2.9    Basic access authentication
2.10    Taxonomic information for all taxa in the species records response xml
2.11    List of sites that a species occurs in (BARS)
2.12    Allow addition of new species to the gateway using the taxonomy discovery service
2.13    For point and buffer return distance from point to record
2.14    Allow geographic extent limit to be extended for single species requests

3    Server side optimisation

2

Yes, definitely agree with you there Dan.

Charles Roper
Digital Development Manager | Field Studies Council
http://www.field-studies-council.org | https://twitter.com/charlesroper | https://twitter.com/fsc_digital

3

Hi James

Speed:
We are still only using the web services for Taxonomy searches and, as you will probably have guessed from my earlier posts elsewhere in the forum, the following options are particularly exciting:
1.4 (Improve filtering options on the request) ~ e.g. filter on category (or categories, or exclude categories)
1.5 (Extend / enhance Discovery services – Species List) - I assume this would be something like Kingdom/Class/Order... (as a drop-down option pulled through from the web services?)
...or even a combination of the above two, which should speed up the queries and also provide more functionality for users.

Can you elaborate a bit more on 1.6 (Additional discovery services)?

We would not want to limit dictionary data just to Gateway available data (1.9) as we use the Taxonomy results to then query our own data but this could be an optional switch (i.e. if people only want to retrieve data from the Gateway) - I assume that is what you were thinking

Functionality:
Not sure - are you also considering enabling Taxonomy searches based on Nature Conservation designations (e.g. IUCN), i.e. extending the use of the 'Designation' attribute?
   
Cheers

Nick

(ePlanning Project Manager) Aberdeenshire Council

4

Hi Nick,

Thanks for the reply - detail on your points below

The idea is to extend the taxonomic levels that can be filtered on, though apparently you rapidly spiral off into taxonomic hierarchy issues. It is not technically difficult for the web services to implement, but there is some question as to whether the dictionaries are up to it!

I would hope that, whatever gets implemented, as a minimum the ability to submit multiple taxon version keys, user-defined designations, or improved taxonomic designation will allow us to get more tightly defined lists back. To be honest, any one of those three would give you reasonable control, though of course different applications might want to use different criteria.

1.4    Improve filtering options on the request
Currently filtering is possible on one species, a group, a designation (now really a server side list) or a dataset.  This often means that much more data is requested from the server than is actually wanted.

•    Allowing the request to handle multiple species would greatly improve the efficiency of some client applications, which currently have to either request repeatedly for each species or request all the species and filter on the client side. Both approaches are very inefficient.

•    Additionally, the groups could be extended to allow a greater range of taxonomic levels.  Currently there are fixed groups that can be filtered for – and these don’t always meet requirements. Not sure if the dictionary is good enough to support this reliably through the TaxonAggregate table.

•    Thirdly, the handling of server side lists should be extended.  Currently a web services request can specify that the response should only contain data for BAP species, species of conservation concern and a limited number of other lists. We are limited by the number of lists that we can currently manage (32).  The list identifier should not be an enumerated type, so that new lists can be added (and consumed) without having to re-consume the WSDL and rebuild proxy classes.  This could be extended to cater for an arbitrarily large number of filters. Web services users could specify the taxa in a filter to the Gateway service team, who would then add the filter to the database.  In addition, an interface could be designed for user management of these lists.  This would enable clients to get back just the data they want!
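
To make the first bullet concrete, here is a minimal sketch comparing the two current client strategies with a single multi-species request. Everything in it is hypothetical: `FakeService`, the record layout and the `TVK...` keys are an in-memory stand-in made up for illustration, not the real Gateway web services or real taxon version keys.

```python
# Hypothetical in-memory stand-in for the server; not the real NBN API.
RECORDS = [
    {"taxon_key": "TVK0001", "site": "A"},
    {"taxon_key": "TVK0002", "site": "A"},
    {"taxon_key": "TVK0003", "site": "B"},
    {"taxon_key": "TVK0001", "site": "C"},
    {"taxon_key": "TVK0004", "site": "C"},
]


class FakeService:
    """Counts requests and records returned so strategies can be compared."""

    def __init__(self, records):
        self.records = records
        self.requests = 0
        self.records_sent = 0

    def query(self, taxon_keys=None):
        """Return all records, or only those matching the given taxon keys."""
        self.requests += 1
        if taxon_keys is None:
            result = list(self.records)
        else:
            wanted = set(taxon_keys)
            result = [r for r in self.records if r["taxon_key"] in wanted]
        self.records_sent += len(result)
        return result


def one_request_per_species(service, keys):
    """Current option 1: repeat the request for each species."""
    out = []
    for key in keys:
        out.extend(service.query([key]))
    return out


def fetch_all_then_filter(service, keys):
    """Current option 2: request everything and filter on the client."""
    wanted = set(keys)
    return [r for r in service.query() if r["taxon_key"] in wanted]


def single_multi_species_request(service, keys):
    """Proposed: send the whole species list in one request."""
    return service.query(keys)


def compare(keys):
    """Return {strategy name: (requests made, records sent by the server)}."""
    stats = {}
    for strategy in (
        one_request_per_species,
        fetch_all_then_filter,
        single_multi_species_request,
    ):
        service = FakeService(RECORDS)
        strategy(service, keys)
        stats[strategy.__name__] = (service.requests, service.records_sent)
    return stats
```

Calling `compare(["TVK0001", "TVK0003"])` on this toy data shows the trade-off: the per-species loop pays in round trips, fetch-and-filter pays in response size, and the multi-species request avoids both.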

1.5    Extend / enhance Discovery services – Species List
The species list discovery service is very useful for getting a quick summary of what exists in a particular area.  For many applications, though, this is not quite enough, forcing users to request the complete list of detailed species records simply to build a summary on the client side.

Add an option switch to additionally return:
•    Number of records for each species
•    Date species last recorded
•    Whether download permission exists for this species at the current access level (at the moment you often drill down to discover that there are no detailed records available)
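
For reference, the first two items are roughly what a client has to compute for itself today after downloading the detailed records. A minimal sketch, assuming each record is a dict with hypothetical `species` and `date` fields (the real response schema is not shown here); download permission is not covered because it cannot be derived on the client side.

```python
from datetime import date

# Made-up sample records; field names are assumptions for illustration.
SAMPLE = [
    {"species": "Lutra lutra", "date": date(2006, 5, 1)},
    {"species": "Lutra lutra", "date": date(2008, 3, 12)},
    {"species": "Meles meles", "date": date(2007, 9, 30)},
]


def summarise(records):
    """Return {species: {"count": n, "last_recorded": most recent date}}."""
    summary = {}
    for rec in records:
        entry = summary.setdefault(
            rec["species"], {"count": 0, "last_recorded": rec["date"]}
        )
        entry["count"] += 1
        if rec["date"] > entry["last_recorded"]:
            entry["last_recorded"] = rec["date"]
    return summary
```

If the discovery service returned these figures directly, the client would only need a handful of values per species rather than every underlying record.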

An additional option switch might be used for the spatial data to return a list of unique positions for each species (with the option to generalise the accuracy up to 1km etc).  This would allow many species-distribution-type applications to get all the data they need without downloading the detailed species data, thus keeping the response size as compact as possible.
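
A minimal sketch of what "unique positions, generalised to 1km" could mean, assuming positions are numeric eastings and northings in metres and that generalising means snapping to the south-west corner of the containing grid square. The field names and the snapping rule are assumptions for illustration, not the Gateway's actual behaviour.

```python
def unique_positions(records, resolution_m=1000):
    """Return {species: sorted list of unique (easting, northing) squares}.

    Each position is snapped down to the south-west corner of the
    resolution_m grid square that contains it.
    """
    positions = {}
    for rec in records:
        easting = (rec["easting"] // resolution_m) * resolution_m
        northing = (rec["northing"] // resolution_m) * resolution_m
        positions.setdefault(rec["species"], set()).add((easting, northing))
    return {species: sorted(squares) for species, squares in positions.items()}


# Made-up records: the first two fall in the same 1km square.
SIGHTINGS = [
    {"species": "Lutra lutra", "easting": 451234, "northing": 203456},
    {"species": "Lutra lutra", "easting": 451900, "northing": 203010},
    {"species": "Lutra lutra", "easting": 452100, "northing": 203500},
]
```

With the sample data, three detailed records collapse to two 1km squares, which is where the saving in response size would come from.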


1.6    Additional discovery services
Discovery services are missing for:
•    known site boundaries - held on the NBN Gateway. Includes SSSI, ASSI, SPA, SAC, Vice County, RAMSAR, NNR sites. The site ID and provider ID are the values you should use if filtering by known site.
•    taxon groups - The taxon group keys are those you should use when filtering on a taxon group
•    designation types - The taxon designation keys are those you should use when filtering on a taxon designation.

Currently these have to be downloaded from the gateway website as a spreadsheet.

Hope this clarifies some of the points
Best wishes
James

5

Dan / Charles

Sorry an earlier post of mine seems to have vanished / not made it :(

Basically I agree, and think that improving the guidance on attribute submission would be good.  However, having looked at some of the mechanics of how the gateway works, I know that this would mean another quite large one-to-many table join at the filtering stage, and what they are quite keen not to have to do is a group-by query at the end of it all to get back to unique records.

So filtering on the presence of one attribute might be relatively easy, but filtering on multiple attributes (or excluding records with some) might be a bit harder, as it would need a bit of a rejig (or some relatively slow post-processing, which might be OK if by that point the number of records had been reduced to a sensible number).
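
The join problem can be illustrated with a toy schema. This is a sketch only: the `record` and `record_attribute` tables are invented for the example (using Python's built-in `sqlite3`) and say nothing about how the Gateway database is actually structured.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE record (id INTEGER PRIMARY KEY, species TEXT);
    CREATE TABLE record_attribute (record_id INTEGER, name TEXT, value TEXT);
    INSERT INTO record VALUES
        (1, 'Lutra lutra'), (2, 'Lutra lutra'), (3, 'Meles meles');
    INSERT INTO record_attribute VALUES
        (1, 'abundance', '2'), (1, 'sex', 'f'),
        (2, 'abundance', '1'),
        (3, 'sex', 'm');
""")

# Presence of ONE attribute: a plain join gives one row per record,
# assuming an attribute name appears at most once per record.
one_attr = conn.execute("""
    SELECT r.id FROM record r
    JOIN record_attribute a ON a.record_id = r.id
    WHERE a.name = 'abundance'
    ORDER BY r.id
""").fetchall()

# Any of SEVERAL attributes: the one-to-many join now repeats record 1,
# so something has to collapse the rows back to unique records.
joined = conn.execute("""
    SELECT r.id FROM record r
    JOIN record_attribute a ON a.record_id = r.id
    WHERE a.name IN ('abundance', 'sex')
    ORDER BY r.id
""").fetchall()

# Records that have BOTH attributes: needs the group-by at the end.
both = conn.execute("""
    SELECT r.id FROM record r
    JOIN record_attribute a ON a.record_id = r.id
    WHERE a.name IN ('abundance', 'sex')
    GROUP BY r.id
    HAVING COUNT(DISTINCT a.name) = 2
    ORDER BY r.id
""").fetchall()
```

On this data the single-attribute query returns records 1 and 2 cleanly, the multi-attribute join returns record 1 twice, and only the grouped query gets back to unique records.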

Best wishes
James

6

Hi James

Sorry, I probably did not make it clear that I was referring to the Taxonomy web service http://data.nbn.org.uk/library/webservi … xonomy.jsp - which does not currently have other options for refined searches, but it would be helpful if it did.

Cheers

Nick

(ePlanning Project Manager) Aberdeenshire Council

7

Hi Dan,

Happy New Year to you too.

I got a draft to the gateway team at the end of Nov, and am now just finishing off the final report, but I'm afraid I'm the wrong person to ask about what happens next - I suggest you ask Jon Cooper, unless he jumps in here with a reply.  I believe they were using the draft to come up with a development plan, which could be starting shortly I guess, but I haven't seen any completion dates mentioned yet.

As it happens I'd quite like a couple of them just at the moment too!

Best wishes
James

8

Hi Dan and James,

A Happy New year to you both. 

We've been through James's web services review (me, Steve W, Geoff Johnson, Nicole and Richard) and adjusted the priorities a bit.  I'll post this on the forum next week.  I'll be scoping the tasks next week (I've already started some of them) and will then put together my work plan so you can see when to expect things.  This work is due to be completed by the end of the financial year.

What I want to do is use this forum to keep you up-to-date with how the work is going and to publish functionality onto www.testnbn.net where we will be testing it and you can test it if you want to (which would be really useful).

More details next week.

Cheers, Jon

9

Hi Dan,

The first thing that will come on stream (on nbntest.net) will be the new user defined polygon.  It won't bring any functional change - it's the same WSDL, but the methods under the bonnet are very different.  A bonus of this is that bringing in the increased designation filtering and user defined species lists will be easier for me to implement and more robust.  Should have it on testnbn next week, so I'll let you know.

FYI I'm currently working on these in this order for end of March:
- user polygon querying optimisation
- simplify designation filtering - i.e. so we can quickly load as many defined species lists as required onto the Gateway
- allow the user to supply a list of species in the request for filtering
- OneSiteData - map only option and switch to exclude dataset specific attributes (e.g. abundance, etc)

I appreciate that's not everything on the desired list - do your summary reports depend on the species level summary information?

Cheers, Jon