1

Topic: export from datawarehouse

Good afternoon,
is there a way to export occurrences and samples from datawarehouse so to import in an other one? I need to migrate some data from a warehouse in the new one without reimporting all the DB. Any idea or help? Thanks

Carlo Politi - Software Engineer
QuestIT s.r.l.
Home page: www.quest-it.com
E-Mail: politi@quest-it.com - Skype: carlo.politi.questit

2

Re: export from datawarehouse

Hi Carlo,

The export can be easily achieved by writing a report that includes all the fields you wish to copy. There is good documentation on writing reports. Having written your report you can run it through the warehouse Report Browser (Entered Data > Reports). It will display a page of results in a table on your screen. Using browser tools, capture the url of the request that is sent when the report is run. You will see a parameter, mode=json. Change this to mode=csv. Use the limit and offset parameters to control the size of the download, breaking it in to chunks if it is huge. By pasting the modified url in to your browser it will download a csv file of records.

You can upload csv files directly in the Occurrences page of the warehouse (Entered Data > Occurrences). At the bottom of the page there are buttons to allow you to browse for a file and then upload it. I know it will import all the core fields of sample and occurrence. What I am unsure about is importing custom attributes.

I suggest you conduct an experiment to see if you can write a csv file of just two or three occurrences which will import all the data you require. If that works you can then write the report which will generate the csv file format containing the full dataset.

There may be another way. There are a set of web services for the exchange of data. I know nothing about these but it might be an option.

Jim Bacon.

3

Re: export from datawarehouse

Jim Bacon wrote:

Hi Carlo,

The export can be easily achieved by writing a report that includes all the fields you wish to copy. There is good documentation on writing reports. Having written your report you can run it through the warehouse Report Browser (Entered Data > Reports). It will display a page of results in a table on your screen. Using browser tools, capture the url of the request that is sent when the report is run. You will see a parameter, mode=json. Change this to mode=csv. Use the limit and offset parameters to control the size of the download, breaking it in to chunks if it is huge. By pasting the modified url in to your browser it will download a csv file of records.

You can upload csv files directly in the Occurrences page of the warehouse (Entered Data > Occurrences). At the bottom of the page there are buttons to allow you to browse for a file and then upload it. I know it will import all the core fields of sample and occurrence. What I am unsure about is importing custom attributes.

I suggest you conduct an experiment to see if you can write a csv file of just two or three occurrences which will import all the data you require. If that works you can then write the report which will generate the csv file format containing the full dataset.

There may be another way. There are a set of web services for the exchange of data. I know nothing about these but it might be an option.

Jim Bacon.


Hi Jim,
i have written an xml report to extract all occurrences from half december 2016 in this way, from "recent_occurrences_in_survey.xml". But this gets only occurrences, right? I need also samples, right?

<report
    title="Recent Observations in Survey QuestIT"
    description="Lists the most recent recordings in a specified survey."
    row_class="biota"
>
  <query>
  SELECT #field_sql#
  FROM samples s
  JOIN occurrences o ON o.sample_id=s.id
  LEFT JOIN occurrence_images oi ON oi.occurrence_id=o.id
  JOIN surveys su ON s.survey_id = su.id
  JOIN list_taxa_taxon_lists lttl ON lttl.id=o.taxa_taxon_list_id
  LEFT JOIN locations l ON l.id=s.location_id
  LEFT JOIN users u ON u.id=o.verified_by_id
  WHERE o.record_status NOT IN ('I','T') AND date_start >='2016-12-15'
  #order_by#
  </query>
  <field_sql>
  su.title, su.website_id, o.id as occurrence_id, s.date_start, s.date_end, s.date_type, lttl.taxon,
  s.entered_sref, l.name as location_name, oi.path,
  CASE o.record_status
    WHEN 'C' THEN 'Data Entry Complete'
    WHEN 'V' THEN 'Verified'
    WHEN 'R' THEN 'Rejected'
    WHEN 'I' THEN 'In Progress'
    WHEN 'T' THEN 'Test'
    WHEN 'S' THEN 'Sent for verification'
    ELSE ''
  END AS record_status,
  u.username as verified_by,
  su.id
  </field_sql>
  <order_bys>
    <order_by>date_start DESC</order_by>
  </order_bys>
  <columns>
    <column name='title' display='Survey' />
    <column name='website_id' visible='false' />
    <column name='occurrence_id' visible='false' />
    <column name='date' display='Date' orderby='date_start' />
    <column name='taxon' display='Taxon' class='sci binomial' />
    <column name='entered_sref' display='Spatial Ref.' />
    <column name='location_name' display='Location' />
    <column name='path' display='Photo' img='true' />
    <column name='record_status' display='Record Status' />
    <column name='verified_by' display='Verified By' />
  </columns>
</report>

Carlo Politi - Software Engineer
QuestIT s.r.l.
Home page: www.quest-it.com
E-Mail: politi@quest-it.com - Skype: carlo.politi.questit

4

Re: export from datawarehouse

Hi Carlo,

Your report joins samples and occurrences to create a single flattened table of results. This is the correct thing to do. If one sample has multiple occurrences then there will be multiple rows for that sample in the output.

Provided all the rows for one sample are together in the csv file then they will be imported as a single sample. For this reason I would change the order_by to be the sample id.

Remember the cache tables combine a lot of information together making queries easier to write and quicker to execute. This is a good start though.

Jim Bacon.

5 (edited by carlo_politi 31-03-2017 14:16:34)

Re: export from datawarehouse

Jim Bacon wrote:

Hi Carlo,

Your report joins samples and occurrences to create a single flattened table of results. This is the correct thing to do. If one sample has multiple occurrences then there will be multiple rows for that sample in the output.

Provided all the rows for one sample are together in the csv file then they will be imported as a single sample. For this reason I would change the order_by to be the sample id.

Remember the cache tables combine a lot of information together making queries easier to write and quicker to execute. This is a good start though.

Jim Bacon.

Hi Jim,
thanks! I haven't tried to write the other report and to import, so my next question could be silly: i have more survey in this result list and in list i have the "su.id". When i import, is there a way to specify which these result belongs to? consider the result is of occurences is max 20 rows

Carlo Politi - Software Engineer
QuestIT s.r.l.
Home page: www.quest-it.com
E-Mail: politi@quest-it.com - Skype: carlo.politi.questit

6

Re: export from datawarehouse

I have checked my net in browser but i see only this possible call:

http://postgres.comune.grosseto.it/Indi … ey_questit

and in params i see
report_name=recent_occurrences_in_survey_questit

and no other params... i tried to add "mode=csv" or "mode=json" but i always get an html page with table results...

Carlo Politi - Software Engineer
QuestIT s.r.l.
Home page: www.quest-it.com
E-Mail: politi@quest-it.com - Skype: carlo.politi.questit