1

Topic: REST API character set?

Which character set are the REST API responses in (e.g. UTF-8, latin1, etc.)?

I didn't find any mention in the documentation (maybe I missed it).
The HTTP headers don't include it (e.g. as a charset parameter in the Content-Type header).
The JSON specification says it SHALL be Unicode, with UTF-8 as the default (Section 3 of RFC 4627).

When decoding responses as UTF-8 I sometimes get decoding errors - it looked like the API was returning latin1 characters.

I am currently decoding as latin1, but I get strange results whenever a response contains UTF-8 encoded characters.
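To illustrate the "strange results", here is a minimal Python sketch of what happens when UTF-8 bytes are decoded as latin1. The sample name "Müller" is made up for the illustration and is not taken from the API.

```python
# Hypothetical sample value; not an actual API response.
original = "M\u00fcller"                 # "Müller"

# What a UTF-8 server would put on the wire.
raw = original.encode("utf-8")
print(raw)                               # b'M\xc3\xbcller'

# Decoding the same bytes two ways.
as_utf8 = raw.decode("utf-8")            # correct: 6 characters
as_latin1 = raw.decode("latin-1")        # wrong: the 2-byte "ü" becomes 2 characters

print(len(as_utf8), len(as_latin1))      # 6 7
print(ascii(as_latin1))                  # 'M\xc3\xbcller', i.e. "MÃ¼ller"
```

latin1 maps every byte to a character, so decoding never fails; it just silently turns each multi-byte UTF-8 sequence into several unrelated characters.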

Which character set should I expect from the API?

Thanks,

Chris.

BTW I included a link to the RFC page at the IETF but upon submitting was warned...

  "Too more links in message. Allowed 0 links. Reduce number of links and post it again."

It seems a shame that I can't cite my references in a technical forum :(

2 (edited by matt.debont 22-07-2014 08:02:31)

Re: REST API character set?

Can you give me an example of the API returning latin1 characters? The responses should be UTF-8, as we don't touch the encoding anywhere between the database and the user, so I would be interested to see whether we have an issue with our JSON encoder.

Also, I didn't think we had a URL restriction; I guess you would have to contact a forum admin for further information. As far as I know, you should not have any issues pasting the URL in as plain text rather than trying to make a link.

Thanks,
Matt

Matt Debont
Application Developer
Joint Nature Conservation Committee, Monkstone House, City Road, Peterborough PE1 1JY, UK

3

Re: REST API character set?

Thanks for confirming that I should expect UTF-8.

It would be useful (at least for debugging) if you could include the charset in the Content-Type header so browsers and apps don't have to guess (for me, Firefox uses UTF-8 and Chrome uses latin1).
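On the client side, something like the following Python sketch would honour a charset parameter if one is sent and fall back to UTF-8 otherwise. The function names are my own, and the UTF-8 fallback is an assumption based on the JSON default rather than anything the API documents.

```python
from email.message import Message


def charset_from_content_type(header_value, default="utf-8"):
    """Return the charset parameter of a Content-Type value, or `default`."""
    if not header_value:
        return default
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() lower-cases the value and returns the
    # fallback when no charset parameter is present.
    return msg.get_content_charset(default)


def decode_body(body, content_type):
    """Decode raw response bytes using the declared (or assumed) charset."""
    return body.decode(charset_from_content_type(content_type))


print(charset_from_content_type("application/json; charset=ISO-8859-1"))
print(charset_from_content_type("application/json"))
```

With a header of plain "application/json" this decodes as UTF-8, which is what the API is expected to send.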

Whilst trying to get you a nice example I've discovered that something at my end is correctly decoding the response as UTF-8 and then re-encoding it as latin1 before giving it to me, so it's my problem rather than yours :(

Here is an example...

   /api/taxonObservations/230141303

The authority is Brünnich (the 'u' has an umlaut). Your server is returning 0xc3 0xbc which is the correct UTF-8 encoding for U+00FC.
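For reference, a small Python sketch that checks those bytes and reproduces what I think is happening at my end. The decode-then-re-encode step is my reconstruction of my own client stack, not anything the server does.

```python
authority = "Br\u00fcnnich"                      # "Brünnich"

# What the server sends: U+00FC as the two bytes 0xc3 0xbc.
server_bytes = authority.encode("utf-8")
print(server_bytes)                              # b'Br\xc3\xbcnnich'

# Assumed behaviour of the intermediate layer on the client side:
# decode correctly as UTF-8, then re-encode as latin1.
client_bytes = server_bytes.decode("utf-8").encode("latin-1")
print(client_bytes)                              # b'Br\xfcnnich'

# The re-encoded bytes are no longer valid UTF-8, hence the decoding errors.
try:
    client_bytes.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False
print(utf8_ok)                                   # False
```

A lone 0xfc byte is never valid in UTF-8, which explains why decoding as UTF-8 failed only on responses containing non-ASCII characters.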

Thanks,

Chris.

BTW I included the links as plain text - the forum app added 'url' in square brackets and then said I couldn't use links! I've removed the scheme and host to see if that'll go through.

4 (edited by matt.debont 28-07-2014 14:09:51)

Re: REST API character set?

No problem. I will see if I can get the API to respond with the charset in the Content-Type header in the future. I may not be able to get around to this quickly, but it will get done. Please let us know if you run into any further issues, as we are happy to help where we can.

If you are having issues with the forum, however, you will need to contact the forum admins, as I don't have any control here.

Thanks,
Matt